FIGURE 1

Plasmid sequence of pNC5LSPCEAp53 (pMC30B5) for vCP2086 
GCCCTTT
CGGGAAA 
ACAGCTT 
TGTCGAA
GTCGGGG 
CAGCCCC 
ACCGCAC 
TGGCGTG
GAAGGGC 
CTTCCCG 
TAAGTTG 
ATTCAAC 



CGTCTCG 
GCAGAGC 
GTCTGTA 
CAGACAT 
CTGGCTT 
GACCGAA 
AGATGCG 
TCTACGC 
GATCGGT 
CTAGCCA 
GGTAACG 
CCATTGC 



CGCGTTT 
GCGCAAA
AGCGGAT 
TCGCCTA 
AACTATG 
TTGATAC 
TAAGGAG 
ATTCCTC 
GCGGGCC 
CGCCCGG 
CCAGGGT 
GGTCCCA 



CGGTGAT 
GCCACTA 
GCCGGGA 
CGGCCCT 
CGGCATC 
GCCGTAG 
AAAATAC 
TTTTATG 
TCTTCGC 
AGAAGCG 
TTTCCCA 
AAAGGGT 



GACGGTG 
CTGCCAC 
GCAGACA 
CGTCTGT
AGAGCAG 
TCTCGTC 
CGCATCA 
GCGTAGT 
TATTACG 
ATAATGC 
GTCACGA 
CAGTGCT 



AAAACCT 
TTTTGGA 
AGCCCGT 
TCGGGCA 
ATTGTAC 
TAACATG 
GGCGCCA 
CCGCGGT 
CCAGCTG 
GGTCGAC 
CGTTGTA 
GCAACAT 



CTGACAC 
GACTGTG 
CAGGGCG 
GTCCCGC 
TGAGAGT 
ACTCTCA 
TTCGCCA 
AAGGGGT 
GCGAAAG 
CGCTTTC 
AAACGAC 
TTTGCTG 



ATGCAGC 
TACGTCG 
CGTCAGC 
GCAGTCG 
GCACCAT 
CGTGGTA 
TTCAGGC 
AAGTCCG 
GGGGATG 
CCCCTAC 
GGCCAGT 
CCGGTCA 



TCCCGGA 
AGGGCCT 
GGGTGTT 
CCCACAA 
ATGCGGT 
TACGCCA
TGCGCAA 
ACGCGTT 
TGCTGCA
ACGACGT 
GCCAAGC 
CGGTTCG



GACGGTC 
CTGCCAG
GGCGGGT 
CCGCCCA 
GTGAAAT 
CACTTTA 
CTGTTGG 
GACAACC 
AGGCGAT
TCCGCTA 
TTGGCTG 
AACCGAC



Left Arm 





421 
491 


CAGGTAT 
GTCCATA 

TTTGGTT 
AAACCAA 


TCTAAAC 
AGATTTG 

TTTCATA 
AAAGTAT 


TAGGAAT 
ATCCTTA 

ATCATAA 
TAGTATT 




561 


GTAGTAT 
CATCATA 


AGACTTA 
TCTGAAT 


TACTTTG 
ATGAAAC 




631 


ACAACAA 
TGTTGTT 


TAATCAT 
ATTAGTA 


CGTCGTC 
GCAGCAG 




701 


ACATCAT 
TGTAGTA 


CTGAATC 
GACTTAG


AATAAAC 
TTATTTG 




771 


TGCTCAT 
ACGAGTA 


GATGTAC 
CTACATG 


AAAAAAA 




841 


ACTAGTC 
TGATCAG 


ATAAAAA 
TATTTTT 


CCCGGGA 
GGGCCCT 



AGATGAA ATTATGT 
TCTACTT TAATACA 

Left Arm 
TCTAACA ACATTTT 
AGATTGT TGTAAAA 

Left Arm 
TAACCAT AGTATAC 
ATTGGTA TCATATG 

Left Arm 
ATCTTCA TCTTCAT 
TAGAAGT AGAAGTA 

Left Arm 
ATAGAAC GGTATAG 
TATCTTG CCATATC 

Left Arm 
CATTATT TAGAAAT 
GTAATAA ATCTTTA 

Left Arm 



GCAAAGG 
CGTTTCC 


AGATACC 
TCTATGG 


TTTAGAT 
AAATCTA 


ATGGATC 
TACCTAG 


TGATTTA 
ACTAAAT 


CACTATA 
GTGATAT 


CTATACC 
GATATGG 


TTCTTGC 
AAGAACG 


ACAAGTC 
TGTTCAG 


GCCATTA 
CGGTAAT 


TTTAGCG 
AAATCGC 


CGTCATC 
GCAGTAG 


TTCTTCA 
AAGAAGT 


TCTAAAA 
AGATTTT 


CAGATTT 
GTCTAAA 


TAAAGTT 
ATTTCAA 


TTCATAT 
AAGTATA 


TCAATAA 
AGTTATT 


CTTTCTT 
GAAAGAA 


TTCTAAA 
AAGATTT 


AGCGTTA 
TCGCAAT 


ATCTCCA 
TAGAGGT 


TTGTAAA 
AACATTT 


ATATACT 
TATATGA 


AACGCGT
TTGCGCA 


TATGCAT 
ATACGTA 


TTTAGAT 
AAATCTA 


CTTTATA 
GAAATAT 


AGCGGCC 
TCGCCGG 


GTGATTA 
CACTAAT 


GAGATAA 
CTCTATT 


AAACTAT 
TTTGATA 


ATCAGAG 
TAGTCTC 


CAACCCC 
GTTGGGG 


AACCAGC 
TTGGTCG 



CEA 



***Ile LeuAla ValGly ValLeuVal- 
ACTCCAA TCATGAT GCCGACA GTGGCCC CAGCTGA GAGACCA GGAGAAG TTCCAGA TGCAGAG ACTGTGA 
TGAGGTT AGTACTA CGGCTGT CACCGGG GTCGACT CTCTGGT CCTCTTC AAGGTCT ACGTCTC TGACACT 

CEA 

..GlyIle MetIle GlyValThr AlaGly AlaSer LeuGlyPro SerThr GlySer AlaSerVal ThrIle*
TGCTCTT GACTATG GAATTAT TGCGGCC AGTAGCC AAGTTAG AGACAAA ACAGGCA TAGGTCC CGTTATT 
ACGAGAA CTGATAC CTTAATA ACGCCGG TCATCGG TTCAATC TCTGTTT TGTCCGT ATCCAGG GCAATAA 

CEA 

.SerLys ValIleSer AsnAsn ArgGly ThrAlaLeu AsnSer ValPhe CysAlaTyr ThrGly AsnAsn
ATTTGGC GTGATTT TGGCGAT AAAGAGA ACTTGTG TGTGTTG CTGCGGT ATCCCAT TGATACG CCAAGAA 
TAAACCG CACTAAA ACCGCTA TTTCTCT TGAACAC ACACAAC GACGCCA TAGGGTA ACTATGC GGTTCTT 

CEA 

AsnProThr IleLys AlaIle PheLeuVal GlnThr HisGln GlnProIle GlyAsn IleArg TrpSerTyr-
TACTGCG GGGATGG GTTAGAG GCCGAGT GGCAGGA GAGGTTG AGGTCCG CTCCCGA AAGGTAA GACGAGT 
ATGACGC CCCTACC CAATCTC CGGCTCA CCGTCCT CTCCAAC TCCAGGC GAGGGCT TTCCATT CTGCTCA 

CEA 

..GlnPro SerPro AsnSerAla SerHis CysSer LeuAsnLeu AspAla GlySer LeuTyrSer SerAsp-
CTGGGGG GGAAATG ATGGGGG TGTCCGG CCCATAG AGGACAT CCAGGGT GACTGGG TCACTGC GGTTTGC 
GACCCCC CCTTTAC TACCCCC ACAGGCC GGGTATC TCCTGTA GGTCCCA CTGACCC AGTGACG CCAAACG 

CEA 

.ProPro SerIleIle ProThr AspPro GlyTyrLeu ValAsp LeuThr ValProAsp SerArg AsnAla
ACTCACT GAGTTCT GGATTCC ACATACA TAGGCTC TTGCGTC ATTTCTT GTGACAT TGAATAG AGTGAGG 
TGAGTGA CTCAAGA CCTAAGG TGTATGT ATCCGAG AACGCAG TAAAGAA CACTGTA ACTTATC TCACTCC 

CEA 

SerValSer AsnGln IleGly CysValTyr AlaArg AlaAsp AsnArgThr ValAsn PheLeu ThrLeuThr-
GTCCTGT TGCCATT GGACAGC TGCAGCC TGGGACT GACTGGG AGGCTCT GACCATT TACCCAC CACAGGT 
CAGGACA ACGGTAA CCTGTCG ACGTCGG ACCCTGA CTGACCC TCCGAGA CTGGTAA ATGGGTG GTGTCCA 

CEA 

..ArgAsn GlyAsn SerLeuGln LeuArg ProSer ValProLeu SerGln GlyAsn ValTrpTrp LeuTyr-
AGGTTGT GTTCTGA GCCTCAG GTTCACA GGTGAAG GCCACAG CATCCTT GTCCTCC ACGGGTT TGGAGTT 
TCCAACA CAAGACT CGGAGTC CAAGTGT CCACTTC CGGTGTC GTAGGAA CAGGAGG TGCCCAA ACCTCAA 

CEA 

.ThrThr AsnGlnAla GluPro GluCys ThrPheAla ValAla AspLys AspGluVal ProLys SerAsn
1471 GTTGCTG GAGATGG AGGGCTT GGGCAGC TCCGCGG AAACAGT TATTGTT TTAACTG TAGTCCT GCTGTGA

CAACGAC CTCTACC TCCCGAA CCCGTCG AGGCGCC TTTGTCA ATAACAA AATTGAC ATCAGGA CGACACT

CEA

AsnSerSer IleSer ProLys ProLeuGlu AlaSer ValThr IleThrLys ValThr ThrArg SerHisGly
1541 CCACTGG CTGAGTT ATTGGCC TGGCAAG TATAGAG TCCGCTG TTCTTCT CAGTTAT GTTGCTT ATAAATA

GGTGACC GACTCAA TAACCGG ACCGTTC ATATCTC AGGCGAC AAGAAGA GTCAATA CAACGAA TATTTAT 

CEA 

..SerAla SerAsn AsnAlaGln CysThr TyrLeu GlySerAsn LysGlu ThrIle AsnSerIle PheLeu-
1611 ACTCTTG AGTATGC TGCTGAA TGTTTCC ATCAATC AGCCAGG AGTACTG TGCAGGG GGGTTGG ATGCTGC 

TGAGAAC TCATACG ACGACTT ACAAAGG TAGTTAG TCGGTCC TCATGAC ACGTCCC CCCAACC TACGACG

CEA 

.GluGln ThrHisGln GlnIle AsnGly AspIleLeu TrpSer TyrGln AlaProPro AsnSer AlaAla
1681 ATGGCAA GAAAGGC TCAAGTT CACGCCG GGACGGT AGTAGGT GTATGAT GGAGATA TAGTTGG GTCGTCT

TACCGTT CTTTCCG AGTTCAA GTGCGGC CCTGCCA TCATCCA CATACTA CCTCTAT ATCAACC CAGCAGA
CEA

HisCysSer LeuSer LeuAsn ValGlyPro ArgTyr TyrThr TyrSerPro SerIle ThrPro AspAspPro-
1751 GGGCCAT ACAAAAC ATTAAGG ATAACAG GGTCGGA GTGATCA ACGGATA ATTCATT CTGAATG CCACACT 

CCCGGTA TGTTTTG TAATTCC TATTGTC CCAGCCT CACTAGT TGCCTAT TAAGTAA GACTTAC GGTGTGA 

CEA 

..GlyTyr LeuVal AsnLeuIle ValPro AspSer HisAspVal SerLeu GluAsn GlnIleGly CysGlu-

1821 CATAAGG TCCTACA TCATTGC GAGTAAC GGACAGG AGTGTCA ATGTGCG GTTATCA TTAGACA ACTGCAA

GTATTCC AGGATGT AGTAACG CTCATTG CCTGTCC TCACAGT TACACGC CAATAGT AATCTGT TGACGTT 

CEA 

.TyrPro GlyValAsp AsnArg ThrVal SerLeuLeu ThrLeu ThrArg AsnAspAsn SerLeu GlnLeu
1891 GCGTGGG CTAACCG GCAAACT TTGGTTA TTGACCC ACCATAA ATAAGTG GTATTTT GAATCTC TGGCTCA

CGCACCC GATTGGC CGTTTGA AACCAAT AACTGGG TGGTATT TATTCAC CATAAAA CTTAGAG ACCGAGT 

CEA 

ArgProSer ValPro LeuSer GlnAsnAsn ValTrp TrpLeu TyrThrThr AsnGln IleGlu ProGluCys- 
1961 CAAGTTA ATGCAAC TGCGTCC TCATCCT CAACTGG GTTAGAA TTGTTAC TAGTTAT GAATGGT TTTGGTG 

GTTCAAT TACGTTG ACGCAGG AGTAGGA GTTGACC CAATCTT AACAATG ATCAATA CTTACCA AAACCAC

CEA 

..ThrLeu AlaVal AlaAspGlu AspGlu ValPro AsnSerAsn AsnSer ThrIle PheProLys ProPro-
2031 GCTCATA CACGGTA ATCGTCG TCACGGT TGTGCGG TTGAGTC CGGTGTC GCTATTG TGAGCTT GGCACGT 

CGAGTAT GTGCCAT TAGCAGC AGTGCCA ACACGCC AACTCAG GCCACAG CGATAAC ACTCGAA CCGTGCA 
CEA

.GluTyr ValThrIle ThrThr ValThr ThrArgAsn LeuGly ThrAsp SerAsnHis AlaGln CysThr
2101 GTAGGAT CCACTAT TGTTCAC GGTAATA TTGGGAA TGAACAG TTCCTGG GTGGACT GTTGGAA AGTGCCA 

CATCCTA GGTGATA ACAAGTG CCATTAT AACCCTT ACTTGTC AAGGACC CACCTGA CAACCTT TCACGGT 

CEA 

TyrSerGly SerAsn AsnVal ThrIleAsn ProIle PheLeu GluGlnThr SerGln GlnPhe ThrGlyAsn-

2171 TTGACAA ACCAGCT GTATTGG GCGGGAG GATTGCT AGCGGCA TGACAGC TCAGATT CAGATTT TCCCCTG 

AACTGTT TGGTCGA CATAACC CGCCCTC CTAACGA TCGCCGT ACTGTCG AGTCTAA GTCTAAA AGGGGAC

CEA 

..ValPhe TrpSer TyrGlnAla ProPro AsnSer AlaAlaHis CysSer LeuAsn LeuAsnGlu GlySer- 
2241 ATCTATA GCTTGTG TTTAGAG GGCTGAT TGTAGGA GCATCGG GTCCGTA AAGCACG TTGAGAA TCACTGA

TAGATAT CGAACAC AAATCTC CCGACTA ACATCCT CGTAGCC CAGGCAT TTCGTGC AACTCTT AGTGACT 

CEA 

.ArgTyr SerThrAsn LeuPro SerIle ThrProAla AspPro GlyTyr LeuValAsn LeuIle ValSer
2311 ATCAGAC CTCCTGG CGCTGAC TGGATTT TGGGTTT CGCATTT GTAGCTT GCTGTGT CGTTCCT GGTCACG 

TAGTCTG GAGGACC GCGACTG ACCTAAA ACCCAAA GCGTAAA CATCGAA CGACACA GCAAGGA CCAGTGC

CEA 

AspSerArg ArgAla SerVal ProAsnGln ThrGlu CysLys Tyr SerAla ThrAsp AsnArg ThrValAsn- 
2381 TTAAACA GGGTCAG AGTTCTA TTTCCGT TGCTGAG TTGGAGT CTAGGGG ACACAGG CAGGGAC TGGTTGT 

AATTTGT CCCAGTC TCAAGAT AAAGGCA ACGACTC AACCTCA GATCCCC TGTGTCC GTCCCTG ACCAACA 
CEA

..PheLeu ThrLeu ThrArgAsn GlyAsn SerLeu GlnLeuArg ProSer ValPro LeuSerGln AsnAsn- 
2451 TCACCCA CCAGAGA TATGTTG CGTCTTG AGTTTCG GGCTCGC ATGTAAA AGCGACG GCATCTT TGTCTTC 

AGTGGGT GGTCTCT ATACAAC GCAGAAC TCAAAGC CCGAGCG TACATTT TCGCTGC CGTAGAA ACAGAAG 

CEA 

.ValTrp TrpLeuTyr ThrAla AspGln ThrGluPro GluCys ThrPhe AlaValAla AspLys AspGlu

2521 GACAGGC TTACTAT TATTGGA GCTAATA GAAGGCT TAGGGAG TTCCGGG TATACCC GGAACTG GCCAGTT 

CTGTCCG AATGATA ATAACCT CGATTAT CTTCCGA ATCCCTC AAGGCCC ATATGGG CCTTGAC CGGTCAA 

CEA 

ValProLys SerAsn AsnSer SerlleSer ProLys ProLeu GluProTyr ValArg PheGln GlyThrAla- 
2591 GCTTCTT CATTCAC AAGATCT GACTTTA TGACGTG TAGGGTG TAGAATC CTGTGTC ATTCTGG ATGATGT

CGAAGAA GTAAGTG TTCTAGA CTGAAAT ACTGCAC ATCCCAC ATCTTAG GACACAG TAAGACC TACTACA 

CEA 

..GluGlu AsnVal LeuAspSer LysIle ValHis LeuThrTyr PheGly ThrAsp AsnGlnIle IleAsn-
2661 TCTGGAT CAGCAGG GATGCAT TGGGGTA TATTATC TCTCGAC CACTGTA TGCGGGC CCTGGGG TAGCTTG 

AGACCTA GTCGTCC CTACGTA ACCCCAT ATAATAG AGAGCTG GTGACAT ACGCCCG GGACCCC ATCGAAC

CEA 

.GlnIle LeuLeuSer AlaAsn ProTyr IleIleGlu ArgGly SerTyr AlaProGly ProThr AlaGln
2731 TTGAGTT CCTATTA CATATCC TATAATT TGACGGT TGCCATC CACTCTT TCACCTT TGTACCA GCTGTAG 

AACTCAA GGATAAT GTATAGG ATATTAA ACTGCCA ACGGTAG GTGAGAA AGTGGAA ACATGGT CGACATC 
CEA

GlnThrGly IleVal TyrGly IleIleGln ArgAsn GlyAsp ValArgGlu GlyLys TyrTrp SerTyrGly
CCAAAAA GATGCTG GGGCAGA TTGTGGA CAAGTAG AAGCACC TCCTTCC CCTCTGC GACATTG AACGGCG 
GGTTTTT CTACGAC CCCGTCT AACACCT GTTCATC TTCGTGG AGGAAGG GGAGACG CTGTAAC TTGCCGC 

CEA

..PheLeu HisGln ProLeuAsn HisVal LeuLeu LeuValGlu LysGly GluAla ValAsnPhe ProThr-
TGGATTC AATAGTG AGCTTGG CAGTGGT GGGCGGG TTCCAGA AGGTTAG AAGTGAG GCTGTGA GCAGGAG 
ACCTAAG TTATCAC TCGAACC GTCACCA CCCGCCC AAGGTCT TCCAATC TTCACTC CGACACT CGTCCTC 

CEA

.SerGlu IleThrLeu LysAla ThrThr ProProAsn TrpPhe ThrLeu LeuSerAla ThrLeu LeuLeu
CCTCTGC CAGGGGA TGCACCA TCTGTGG GGAGGGG CCGAGGG AGACTCC ATTATTT ATATTCC AAAAAAA
GGAGACG GTCCCCT ACGTGGT AGACACC CCTCCCC GGCTCCC TCTGAGG TAATAAA TATAAGG TTTTTTT 



E/L Promoter 



CEA

ArgGlnTrp ProIle CysTrp ArgHisPro ProAla SerPro SerGluMet

H6 promoter 



AAAAATA AAATTTC AATTTTT GTCGACC TGCAGCT CGACGGA TCCCCCC GGGTTCT TTATTCT ATACTTA 
TTTTTAT TTTAAAG TTAAAAA CAGCTGG ACGTCGA GCTGCCT AGGGGGG CCCAAGA AATAAGA TATGAAT 

E/L Promoter 

H6 promoter 



AAAAGTG AAAATAA ATACAAA GGTTCTT GAGGGTT GTGTTAA ATTGAAA GCGAGAA ATAATCA TAAATTA 
TTTTCAC TTTTATT TATGTTT CCAAGAA CTCCCAA CACAATT TAACTTT CGCTCTT TATTAGT ATTTAAT 

p53



H6 promoter 



MetGlu GluProGln SerAsp ProSer ValGluPro 
TTTCATT ATCGCGA TATCCGT TAAGTTT GTATCGT AATGGAG GAGCCGC AGTCAGA TCCTAGC GTCGAGC 
AAAGTAA TAGCGCT ATAGGCA ATTCAAA CATAGCA TTACCTC CTCGGCG TCAGTCT AGGATCG CAGCTCG 

p53



..ProLeu SerGln GluThrPhe SerAsp LeuTrp LysLeuLeu ProGlu AsnAsn ValLeuSer ProLeu
CCCCTCT GAGTCAG GAAACAT TTTCAGA CCTATGG AAACTAC TTCCTGA AAACAAC GTTCTGT CCCCCTT 
GGGGAGA CTCAGTC CTTTGTA AAAGTCT GGATACC TTTGATG AAGGACT TTTGTTG CAAGACA GGGGGAA 

p53



.ProSer GlnAlaMet AspAsp LeuMet LeuSerPro AspAsp IleGlu GlnTrpPhe ThrGlu AspPro
GCCGTCC CAAGCAA TGGATGA TTTGATG CTGTCCC CGGACGA TATTGAA CAATGGT TCACTGA AGACCCA 
CGGCAGG GTTCGTT ACCTACT AAACTAC GACAGGG GCCTGCT ATAACTT GTTACCA AGTGACT TCTGGGT 

p53



GlyProAsp GluAla ProArg MetProGlu AlaAla ProPro ValAlaPro AlaPro AlaAla ProThrPro 
GGTCCAG ATGAAGC TCCCAGA ATGCCAG AGGCTGC TCCCCCC GTGGCCC CTGCACC AGCAGCT CCTACAC 
CCAGGTC TACTTCG AGGGTCT TACGGTC TCCGACG AGGGGGG CACCGGG GACGTGG TCGTCGA GGATGTG 

p53



..AlaAla ProAla ProAlaPro SerTrp ProLeu SerSerSer ValPro SerGln LysThrTyr GlnGly
CGGCGGC CCCTGCA CCAGCCC CCTCCTG GCCCCTG TCATCTT CTGTCCC TTCCCAG AAAACCT ACCAGGG 
GCCGCCG GGGACGT GGTCGGG GGAGGAC CGGGGAC AGTAGAA GACAGGG AAGGGTC TTTTGGA TGGTCCC 

p53 



SerTyr GlyPheArg LeuGly PheLeu HisSerGly ThrAla LysSer ValThrCys ThrTyr SerPro 
CAGCTAC GGTTTCC GTCTGGG CTTCTTG CATTCTG GGACAGC CAAGTCT GTGACTT GCACGTA CTCCCCT 
GTCGATG CCAAAGG CAGACCC GAAGAAC GTAAGAC CCTGTCG GTTCAGA CACTGAA CGTGCAT GAGGGGA 

p53



AlaLeuAsn LysMet PheCys GlnLeuAla LysThr CysPro ValGlnLeu TrpVal AspSer ThrProPro 
GCCCTCA ACAAGAT GTTTTGC CAACTGG CCAAGAC CTGCCCT GTGCAGC TGTGGGT TGATTCC ACACCCC 
CGGGAGT TGTTCTA CAAAACG GTTGACC GGTTCTG GAGGGGA CACGTCG ACACCCA ACTAAGG TGTGGGG 

p53 



..ProGly ThrArg ValArgAla MetAla IleTyr LysGlnSer GlnHis MetThr GluValVal ArgArg 
CGCCCGG CACCCGC GTCCGCG CCATGGC CATCTAC AAGCAGT CACAGCA CATGACG GAGGTTG TGAGGCG 
GCGGGCC GTGGGCG CAGGCGC GGTACCG GTAGATG TTCGTCA GTGTCGT GTACTGC CTCCAAC ACTCCGC 

p53



.CysPro HisHisGlu ArgCys SerAsp SerAspGly LeuAla ProPro GlnHisLeu IleArg ValGlu
CTGCCCC CACCATG AGCGCTG CTCAGAT AGCGATG GTCTGGC CCCTCCT CAGCATC TTATCCG AGTGGAA 
GACGGGG GTGGTAC TCGCGAC GAGTCTA TCGCTAC CAGACCG GGGAGGA GTCGTAG AATAGGC TCACCTT 
p53

GlyAsnLeu ArgVal GluTyr LeuAspAsp ArgAsn ThrPhe ArgHisSer ValVal ValPro TyrGluPro-
GGAAATT TGCGTGT GGAGTAT TTGGATG ACAGAAA CACTTTT CGACATA GTGTGGT GGTGCCC TATGAGC 
CCTTTAA ACGCACA CCTCATA AACCTAC TGTCTTT GTGAAAA GCTGTAT CACACCA CCACGGG ATACTCG 

p53



..ProGlu ValGly SerAspCys ThrThr IleHis TyrAsnTyr MetCys AsnSer SerCysMet GlyGly
CGCCTGA GGTTGGC TCTGACT GTACCAC CATCCAC TACAACT ACATGTG TAACAGT TCCTGCA TGGGCGG 
GCGGACT CCAACCG AGACTGA CATGGTG GTAGGTG ATGTTGA TGTACAC ATTGTCA AGGACGT ACCCGCC 

p53

.MetAsn ArgArgPro IleLeu ThrIle IleThrLeu GluAsp SerSer GlyAsnLeu LeuGly ArgAsn
CATGAAC CGGAGGC CCATCCT CACCATC ATCACAC TGGAAGA CTCCAGT GGTAATC TACTGGG ACGGAAC 
GTACTTG GCCTCCG GGTAGGA GTGGTAG TAGTGTG ACCTTCT GAGGTCA CCATTAG ATGACCC TGCCTTG 

p53



SerPheGlu ValArg ValCys AlaCysPro GlyArg AspArg ArgThrGlu GluGlu AsnLeu ArgLysLys- 
AGCTTTG AGGTGCG TGTTTGT GCCTGTC CTGGGAG AGACCGG CGCACAG AGGAAGA GAATCTC CGCAAGA 
TCGAAAC TCCACGC ACAAACA CGGACAG GACCCTC TCTGGCC GCGTGTC TCCTTCT CTTAGAG GCGTTCT 

p53 



..GlyGlu ProHis HisGluLeu ProPro GlySer ThrLysArg AlaLeu ProAsn AsnThrSer SerSer-
AAGGGGA GCCTCAC CACGAGC TGCCCCC AGGGAGC ACTAAGC GAGCACT GCCCAAC AACACCA GCTCCTC
TTCCCCT CGGAGTG GTGCTCG ACGGGGG TCCCTCG TGATTCG CTCGTGA CGGGTTG TTGTGGT GGAGGAG

p53 

.ProGln ProLysLys LysPro LeuAsp GlyGluTyr PheThr LeuGln IleArgGly ArgGlu ArgPhe 
TCCCCAG CCAAAGA AGAAACC ACTGGAT GGAGAAT ATTTCAC CCTTCAG ATCCGTG GGCGTGA GCGCTTC
AGGGGTC GGTTTCT TCTTTGG TGACCTA CCTCTTA TAAAGTG GGAAGTC TAGGCAC CCGCACT CGCGAAG 

p53

GluMetPhe ArgGlu LeuAsn GluAlaLeu GluLeu LysAsp AlaGlnAla GlyLys GluPro GlyGlySer-
GAGATGT TCCGAGA GCTGAAT GAGGCCT TGGAACT CAAGGAT GCCCAGG CTGGGAA GGAGCCA GGGGGGA 
CTCTACA AGGCTCT CGACTTA CTCCGGA ACCTTGA GTTCCTA CGGGTCC GACCCTT CCTCGGT CCCCCCT 

p53



..ArgAla HisSer SerHisLeu LysSer LysLys GlyGlnSer ThrSer ArgHis LysLysLeu MetPhe-
GCAGGGC TCACTCC AGCCACC TGAAGTC CAAAAAG GGTCAGT CTACCTC CCGCCAT AAAAAAC TCATGTT
CGTCCCG AGTGAGG TCGGTGG ACTTCAG GTTTTTC CCAGTCA GATGGAG GGCGGTA TTTTTTG AGTACAA 
p53 

.LysThr GluGlyPro AspSer Asp*** 

CAAGACA GAAGGGC CTGACTC AGACTGA ACGCGTT TTTTATC CCGGGCT CGAGGGT ACCGGAT CCTTTTT 
GTTCTGT CTTCCCG GACTGAG TCTGACT TGCGCAA AAAATAG GGCCCGA GCTCCCA TGGCCTA GGAAAAA
ATAGCTA ATTAGTC ACGTACC TTTGAGA GTACCAC TTCAGCT ACCTCTT TTGTGTC TCAGAGT AACTTTC 
TATCGAT TAATCAG TGCATGG AAACTCT CATGGTG AAGTCGA TGGAGAA AACACAG AGTCTCA TTGAAAG 



TTTAATC 
AAATTAG 

TTGAAAA 
AACTTTT 

GGACATT 
CCTGTAA 

GGAGTAC 
CCTCATG 

AATTTTT 
TTAAAAA 

ATGAGCC 
TACTCGG 

CCGCTGT 
GGCGACA 

ACACCGC 
TGTGGCG 



AATTCCA 
TTAAGGT 



GTAGCCT 
CATCGGA 



TTATTTT 
AATAAAA 



GTCCCAT 
CAGGGTA 



ATTTGAT 
TAAACTA 



AATAGAA 
TTATCTT 



AAACAGT 
TTTGTCA 



GAGCACT 
CTCGTGA 



TTTTAAG 
AAAATTC 



GTTATCC 
CAATAGG 



ACGGCTA 
TGCCGAT 



GTTTAAC 
CAAATTG 



AAAGAAC 
TTTCTTG 



ATCATGC 
TAGTACG 



ATTGTTT 
TAACAAA 



AGACAAG 
TCTGTTC 



TTTTATA GCCTCGG TATTCTT 



ATATGAT TTTCCAT 
TATACTA AAAGGTA 

Right Arm 
TCTTTTC TACCATG 
AGAAAAG ATGGTAC 

Right Arm 
TAGTGTG CTACATA 
ATCACAC GATGTAT 

Right Arm 
AGCAAGT CAGTATC 
TCGTTCA GTCATAG 

Right Arm 
TATGTAG AGGAGTT 
ATACATC TCCTCAA 

Right Arm 
CAAATTA ACTTTGT 
GTTTAAT TGAAACA 

Right Arm 
ACATAGT TATTCTT 
TGTATCA ATAAGAA 

Right Arm 
AAGTTGT GCATTCA 
TTCAACA CGTAAGT 

Right Arm 
GAACATT ACAGCCA 



Right Arm 

TTCTTTC AAAGATG TAGTTTA CATCTGC TCCTTTG 
AAGAAAG TTTCTAC ATCAAAT GTAGACG AGGAAAC 

AATTACA GCTGGCA AGATCAA TTTTTCC CAGTTCT 
TTAATGT CGACCGT TCTAGTT AAAAAGG GTCAAGA 

TTTCAAT ATTTCCA GATTGTA CAGCGAT CATTAAA 
AAAGTTA TAAAGGT CTAACAT GTCGCTA GTAATTT 



AGCACCT 
TCGTGGA 

AACCGAT 
TTGGCTA 

TAAGGTA 
ATTCCAT 

CAACAGA 
GTTGTCT 

GTAACTA 
CATTGAT 



TTGTTCA 
AACAAGT 



CCGTGTT 
GGCACAA 



AGCTGCC 
TCGACGG 



TCTTTCA 
AGAAAGT 



CAGGTTT 
GTCCAAA 



ATAGAAG 
TATCTTC 



TGAAATA 
ACTTTAT 



AAACACA 
TTTGTGT 



CTATTTT 
GATAAAA 



AGCTCCA 
TCGAGGT 



TTTAACC 
AAATTGG 



ATTGTTA 
TAACAAT 



TCTACAT 
AGATGTA 



AAGGAGT 
TTCCTCA 



GTAGTCG 
CATCAGC 



TACCTCA 
ATGGAGT 



CCGCCGA 
GGCGGCT 



AAAGCCT 
TTTCGGA 



TCTCTCA 
AGAGAGT 



TCAAGAT 
AGTTCTA 



TTTCAAG AGGAGAT TGTAGAG TACCATA TTCCGTG 
AAAATAT CGGAGCC ATAAGAA




5111
5181 
5251 
5321 
5391 
5461 
5531 
5601 
5671 
5741 
5811 
5881 
5951 

6021 
6091 
6161 
6231 
6301 
6371 
6441 
6511 
6581 
6651 
6721 
6791 
6861 
6931 
7001 
7071 



TTAGGGT 
AATCCCA 



CGTATAT 
GCATATA 



TTTTTCA 
AAAAAGT 



AATATCT 
TTATAGA 



TCATAGT 
AGTATCA 



GATTTTT 
CTAAAAA 



TAAGTTA 
ATTCAAT 



TGTCGTA 
ACAGCAT 



AGTAATT 
TCATTAA 



GTAAAAT 
CATTTTA 



AATTATG 
TTAATAC 



TTTGTAG 
AAACATC 



ATAACAT 
TATTGTA 



CGAATCC ATTGTCC 
GCTTAGG TAACAGG 

GTAAGCC ATCTTGA 
CATTCGG TAGAACT 

TTTTCTT CATGATT 
AAAAGAA GTACTAA 

GAAACAC ATACATA 
CTTTGTG TATGTAT 

TTACAAA TTCGCAG
AATGTTT AAGCGTC 

GTATAAT ATAACTG 
CATATTA TATTGAC 

CATCTGT CAAATCC 
GTAGACA GTTTAGG 

CCCAATT ATCATGA 
GGGTTAA TAGTACT 

GTAAAGA GTATACG 
CATTTCT CATATGC

AATTACG ATATTAC 
TTAATGC TATAATG 

TTTATTT GTGTATA 
AAATAAA CACATAT 

TATTTGA ATCCTTT 
ATAAACT TAGGAAA 

TTAACAT TCAGAAT 
AATTGTA AGTCTTA 



CTTGTAA TGTCGGT

Right Arm 
AAAAACC TATTTAG 
TTTTTGG ATAAATC 

Right Arm
ATGTATA ATTTTGT 
TACATAT TAAAACA 

Right Arm
AATATAG TTTACGG 
TTATATC AAATGCC 

Right Arm 
AAACATG GAAGAAT 
TTTGTAC CTTCTTA 

Right Arm 
TAATCTT CATCTTT 
ATTAGAA GTAGAAA 

Right Arm 
GTATCCT ATCTTCC 
CATAGGA TAGAAGG 

Right Arm 
ATCTTTC CAACTGA 
TAGAAAG GTTGACT 

Right Arm 
CAAGATT CTCTTAA 
GTTCTAA GAGAATT 

Right Arm 
ATAACAG TATAGAT 
TATTGTC ATATCTA 

Right Arm 
ATTTCCT TTTATTA 
TAAAGGA AAATAAT 

Right Arm 
TTTAAAG CGTCGTT 
AAATTTC GCAGCAA 

Right Arm 
CTTTAAA TGGATTA 
GAAATTT ACCTAAT 

Right Arm 
TGCGGCC GCAATTC 
ACGCCGG CGTTAAG 



AAAGTTC TCCTCTA ACATCTC ATGGTAT AAGGCAC 

AGATGCA TTGTCAT TATCCAT GATAGCC TCACAGA 
TCTACGT AACAGTA ATAGGTA CTATCGG AGTGTCT 

TGTTTTC AACAACC GCTCGTG AACAGCT TCTATAC 
ACAAAAG TTGTTGG CGAGCAC TTGTCGA AGATATG 

AATATAA GTATACA AAAAGTT TATAGTA ATCTCAT 
TTATATT CATATGT TTTTCAA ATATCAT TAGAGTA 

TACACGA TGTCGTT GAGATAA ATGGCTT TTTATTG
ATGTGCT ACAGCAA CTCTATT TACCGAA AAATAAC 

TACGAAT ATTGCAG AATCTGT TTTATCC AACCAGT 
ATGCTTA TAACGTC TTAGACA AAATAGG TTGGTCA 

GATAGAA TGCTGTT ATTTAAC ATTTTTG CACCTAT 
CTATCTT ACGACAA TAAATTG TAAAAAC GTGGATA 

CTTTATG TAACGAT GCGAAAT AGCATTT ATCACTA 
GAAATAC ATTGCTA CGCTTTA TCGTAAA TAGTGAT 

ATACGTA ATCTTAT TATCTCT TGCATAT TCGTAAT 
TATGCAT TAGAATA ATAGAGA ACGTATA AGCATTA 

ATACACG TGATATA AATATTT AACCCCA TTCCTGA 
TATGTGC ACTATAT TTATAAA TTGGGGT AAGGACT 

TTTTTAT GTTTTAG TTATTTG TTAGGTT ATACAAA 
AAAAATA CAAAATC AATAAAC AATCCAA TATGTTT 

AAGAATA AGCTTAG TTAACAT ATTATCG CTTAGGT 
TTCTTAT TCGAATC AATTGTA TAATAGC GAATCCA 

TTTTTCC AATGCAT ATTTATA GCTTCAT CCAAAGT 
AAAAAGG TTACGTA TAAATAT CGAAGTA GGTTTCA 

AATTCGT AATCATG GTCATAG CTGTTTC CTGTGTG 
TTAAGCA TTAGTAC CAGTATC GACAAAG GACACAC 



AAATTGT 
TTTAACA 
TAATGAG 
ATTACTC 
GCCAGCT 
CGGTCGA 
CTCGCTC 
GAGCGAG 
ATACGGT 
TATGCCA 
GGAACCG 
CCTTGGC 
TCGACGC 
AGCTGCG 
TCCCTCG 
AGGGAGC 
GCGTGGC 
CGCACCG 
CTGTGTG 
GACACAC 
CCGGTAA 
GGCCATT 
GCGGTGC 
CGCCACG 
CGCTCTG 
GCGAGAC 
GGTAGCG 
CCATCGC 
TGATCTT 
ACTAGAA 
ATCAAAA 
TAGTTTT 



Right 
TATCCGC 
ATAGGCG 
TGAGCTA 
ACTCGAT 
GCATTAA 
CGTAATT 
ACTGACT 
TGACTGA 
TATCCAC 
ATAGGTG 
TAAAAAG 
ATTTTTC 
TCAAGTC 
AGTTCAG 
TGCGCTC 
ACGCGAG 
GCTTTCT 
CGAAAGA 
CACGAAC 
GTGCTTG 
GACACGA 
CTGTGCT 
TACAGAG 
ATGTCTC 
CTGAAGC 
GACTTCG 
GTGGTTT 
CACCAAA 
TTCTACG 
AAGATGC 
AGGATCT 
TCCTAGA 



Arm 

TCACAAT 
AGTGTTA 
ACTCACA 
TGAGTGT 
TGAATCG 
ACTTAGC 
CGCTGCG 
GCGACGC 
AGAATCA 
TCTTAGT 
GCCGCGT 
CGGCGCA 
AGAGGTG 
TCTCCAC 
TCCTGTT 
AGGACAA 
CATAGCT 
GTATCGA 
CCCCCGT 
GGGGGCA 
CTTATCG 
GAATAGC 
TTCTTGA 
AAGAACT 
CAGTTAC 
GTCAATG 
TTTTGTT 
AAAACAA 
GGGTCTG 
CCCAGAC 
TCACCTA 
AGTGGAT 



TCCACAC 
AGGTGTG 
TTAATTG 
AATTAAC 
GCCAACG 
CGGTTGC 
CTCGGTC 
GAGCCAG 
GGGGATA 
CCCCTAT 
TGCTGGC 
ACGACCG 
GCGAAAC 
CGCTTTG 
CCGACCC 
GGCTGGG 
CACGCTG 
GTGCGAC 
TCAGCCC 
AGTCGGG 
CCACTGG 
GGTGACC 
AGTGGTG 
TCACCAC 
CTTCGGA 
GAAGCCT 
TGCAAGC 
ACGTTCG 
ACGCTCA 
TGCGAGT 
GATCCTT 
CTAGGAA 



AACATAC 
TTGTATG 
CGTTGCG 
GCAACGC 
CGCGGGG 
GCGCCCC 
GTTCGGC 
CAAGCCG 
ACGCAGG 
TGCGTCC 
GTTTTTC 
CAAAAAG 
CCGACAG 
GGCTGTC 
TGCCGCT 
ACGGCGA 
TAGGTAT 
ATCCATA 
GACCGCT 
CTGGCGA 
CAGCAGC 
GTCGTCG 
GCCTAAC 
CGGATTG 
AAAAGAG 
TTTTCTC 
AGCAGAT 
TCGTCTA 
GTGGAAC 
CACCTTG 
TTAAATT 
AATTTAA 



GAGCCGG 
CTCGGCC 
CTCACTG 
GAGTGAC 
AGAGGCG 
TCTCCGC 
TGCGGCG 
ACGCCGC 
AAAGAAC 
TTTCTTG 
CATAGGC 
GTATCCG 
GACTATA 
CTGATAT 
TACCGGA 
ATGGCCT 
CTCAGTT 
GAGTCAA 
GCGCCTT 
CGCGGAA 
CACTGGT 
GTGACCA 
TACGGCT 
ATGCCGA 
TTGGTAG 
AACCATC 
TACGCGC 
ATGCGCG 
GAAAACT 
CTTTTGA 
AAAAATG 
TTTTTAC 



AAGCATA 
TTCGTAT 
CCCGCTT
GGGCGAA 
GTTTGCG 
CAAACGC 
AGCGGTA 
TCGCCAT 
ATGTGAG 
TACACTC 
TCCGCCC 
AGGCGGG 
AAGATAC 
TTCTATG 
TACCTGT 
ATGGACA 
CGGTGTA 
GCCACAT 
ATCCGGT 
TAGGCCA 
AACAGGA 
TTGTCCT 
ACACTAG 
TGTGATC 
CTCTTGA 
GAGAACT 
AGAAAAA 
TCTTTTT 
CACGTTA 
GTGCAAT 
AAGTTTT 
TTCAAAA 



AAGTGTA 
TTCACAT 
TCCAGTC 
AGGTCAG 
TATTGGG 
ATAACCC 
TCAGCTC 
AGTCGAG 
CAAAAGG 
GTTTTCC 
CCCTGAC 
GGGACTG 
CAGGCGT 
GTCCGCA 
CCGCCTT 
GGCGGAA 
GGTCGTT 
CCAGCAA
AACTATC 
TTGATAG 
TTAGCAG 
AATCGTC 
AAGGACA 
TTCCTGT 
TCCGGCA 
AGGCCGT 
AAGGATC 
TTCCTAG 
AGGGATT 
TCCCTAA 
AAATCAA 
TTTAGTT 



AAGCCTG 
TTCGGAC 
GGGAAAC 
CCCTTTG 
CGCTCTT 
GCGAGAA 
ACTCAAA 
TGAGTTT 
CCAGCAA 
GGTCGTT 
GAGCATC 
CTCGTAG 
TTCCCCC 
AAGGGGG 
TCTCCCT 
AGAGGGA 
CGCTCCA 
GCGAGGT 
GTCTTGA 
CAGAACT 
AGCGAGG 
TCGCTCC 
GTATTTG 
CATAAAC 
AACAAAC 
TTGTTTG 
TCAAGAA 
AGTTCTT 
TTGGTCA 
AACCAGT 
TCTAAAG 
AGATTTC 



GGGTGCC 
CCCACGG 
CTGTCGT 
GACAGCA 
CCGCTTC 
GGCGAAG 
GGCGGTA 
CCGCCAT 
AAGGCCA 
TTCCGGT 
ACAAAAA 
TGTTTTT 
TGGAAGC 
ACCTTCG 
TCGGGAA 
AGCCCTT 
AGCTGGG 
TCGACCC 
GTCCAAC 
CAGGTTG 
TATGTAG 
ATACATC 
GTATCTG 
CATAGAC 
CACCGCT 
GTGGCGA 
GATCCTT 
CTAGGAA 
TGAGATT 
ACTCTAA 
TATATAT 
ATATATA 
GAGTAAA CTTGGTC TGACAGT TACCAAT GCTTAAT CAGTGAG GCACCTA TCTCAGC GATCTGT CTATTTC
CTCATTT GAACCAG ACTGTCA ATGGTTA CGAATTA GTCACTC CGTGGAT AGAGTCG CTAGACA GATAAAG



Amp resistance gene 

GTTCATC CATAGTT GCCTGAC TCCCCGT CGTGTAG ATAACTA CGATACG GGAGGGC TTACCAT CTGGCCC 
CAAGTAG GTATCAA CGGACTG AGGGGCA GCACATC TATTGAT GCTATGC CCTCCCG AATGGTA GACCGGG 
Amp resistance gene 

CAGTGCT GCAATGA TACCGCG AGACCCA CGCTCAC CGGCTCC AGATTTA TCAGCAA TAAACCA GCCAGCC 
GTCACGA CGTTACT ATGGCGC TCTGGGT GCGAGTG GCCGAGG TCTAAAT AGTCGTT ATTTGGT CGGTCGG
Amp resistance gene 

GGAAGGG CCGAGCG CAGAAGT GGTCCTG CAACTTT ATCCGCC TCCATCC AGTCTAT TAATTGT TGCCGGG
CCTTCCC GGCTCGC GTCTTCA CCAGGAC GTTGAAA TAGGCGG AGGTAGG TCAGATA ATTAACA ACGGCCC 
Amp resistance gene 

AAGCTAG AGTAAGT AGTTCGC CAGTTAA TAGTTTG CGCAACG TTGTTGC CATTGCT ACAGGCA TCGTGGT 
TTCGATC TCATTCA TCAAGCG GTCAATT ATCAAAC GCGTTGC AACAACG GTAACGA TGTCCGT AGCACCA
Amp resistance gene 

GTCACGC TCGTCGT TTGGTAT GGCTTCA TTCAGCT CCGGTTC CCAACGA TCAAGGC GAGTTAC ATGATCC
CAGTGCG AGCAGCA AACCATA CCGAAGT AAGTCGA GGCCAAG GGTTGCT AGTTCCG CTCAATG TACTAGG 
Amp resistance gene 

CCCATGT TGTGCAA AAAAGCG GTTAGCT CCTTCGG TCCTCCG ATCGTTG TCAGAAG TAAGTTG GCCGCAG 
GGGTACA ACACGTT TTTTCGC CAATCGA GGAAGCC AGGAGGC TAGCAAC AGTCTTC ATTCAAC CGGCGTC
Amp resistance gene 

TGTTATC ACTCATG GTTATGG CAGCACT GCATAAT TCTCTTA CTGTCAT GCCATCC GTAAGAT GCTTTTC 
ACAATAG TGAGTAC CAATACC GTCGTGA CGTATTA AGAGAAT GACAGTA CGGTAGG CATTCTA CGAAAAG 
Amp resistance gene 

TGTGACT GGTGAGT ACTCAAC CAAGTCA TTCTGAG AATAGTG TATGCGG CGACCGA GTTGCTC TTGCCCG 
ACACTGA CCACTCA TGAGTTG GTTCAGT AAGACTC TTATCAC ATACGCC GCTGGCT CAACGAG AACGGGC
Amp resistance gene 

GCGTCAA TACGGGA TAATACC GCGCCAC ATAGCAG AACTTTA AAAGTGC TCATCAT TGGAAAA CGTTCTT
CGCAGTT ATGCCCT ATTATGG CGCGGTG TATCGTC TTGAAAT TTTCACG AGTAGTA ACCTTTT GCAAGAA 
Amp resistance gene 

CGGGGCG AAAACTC TCAAGGA TCTTACC GCTGTTG AGATCCA GTTCGAT GTAACCC ACTCGTG CACCCAA 
GCCCCGC TTTTGAG AGTTCCT AGAATGG CGACAAC TCTAGGT CAAGCTA CATTGGG TGAGCAC GTGGGTT 
Amp resistance gene 

CTGATCT TCAGCAT CTTTTAC TTTCACC AGCGTTT CTGGGTG AGCAAAA ACAGGAA GGCAAAA TGCCGCA 
GACTAGA AGTCGTA GAAAATG AAAGTGG TCGCAAA GACCCAC TCGTTTT TGTCCTT CCGTTTT ACGGCGT
Amp resistance gene 

AAAAAGG GAATAAG GGCGACA CGGAAAT GTTGAAT ACTCATA CTCTTCC TTTTTCA ATATTAT TGAAGCA 
TTTTTCC CTTATTC CCGCTGT GCCTTTA CAACTTA TGAGTAT GAGAAGG AAAAAGT TATAATA ACTTCGT 



Amp resistance gene 

TTTATCA GGGTTAT TGTCTCA TGAGCGG ATACATA TTTGAAT GTATTTA GAAAAAT AAACAAA TAGGGGT
AAATAGT CCCAATA ACAGAGT ACTCGCC TATGTAT AAACTTA CATAAAT CTTTTTA TTTGTTT ATCCCCA 
TCCGCGC ACATTTC CCCGAAA AGTGCCA CCTGACG TCTAAGA AACCATT ATTATCA TGACATT AACCTAT 
AGGCGCG TGTAAAG GGGCTTT TCACGGT GGACTGC AGATTCT TTGGTAA TAATAGT ACTGTAA TTGGATA
AAAAATA GGCGTAT CACGAG
TTTTTAT CCGCATA GTGCTC
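
The listing in Figure 1 prints each top-strand segment directly above its complementary bottom-strand segment, so a transcription error in either strand shows up as a pair that no longer base-pairs. The following is a minimal sketch of such a consistency check, assuming the segments are available as plain text using only A, C, G and T. The function names `complement` and `mismatches` are illustrative and are not part of the figure.

```python
# Hypothetical helpers for checking a double-stranded listing;
# only the Python standard library is used.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA string (not reversed)."""
    return seq.upper().translate(COMPLEMENT)


def mismatches(top: str, bottom: str) -> list[int]:
    """Return 0-based positions where `bottom` is not the complement of `top`."""
    if len(top) != len(bottom):
        raise ValueError("strand segments must have equal length")
    expected = complement(top)
    return [i for i, (e, b) in enumerate(zip(expected, bottom.upper())) if e != b]


# A consistent pair, followed by a pair with one unreadable base ("N").
print(mismatches("GCCCTTT", "CGGGAAA"))  # []
print(mismatches("ACAGCTT", "TGTNGAA"))  # [3]
```

A non-empty result only says that the two strands disagree at that position; deciding which strand carries the error needs outside information, such as the amino acid translation printed above the coding regions.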
FIGURE 2A

Upper row of each pair: mCEA(6D)
Lower row of each pair: mCEA(6D,1st&2nd)

1 50 
ATGGAGTCTC CCTCGGCCCC TCCCCACAGA TGGTGCATCC CCTGGCAGAG 
ATGGAGTCTC CCTCGGCCCC TCCCCACAGA TGGTGCATCC CCTGGCAGAG 

51 100 
GCTCCTGCTC ACAGCCTCAC TTCTAACCTT CTGGAACCCG CCCACCACTG 
GCTCCTGCTC ACAGCCTCAC TTCTAACCTT CTGGAACCCG CCCACCACTG 

101 150 
CCAAGCTCAC TATTGAATCC ACGCCGTTCA ATGTCGCAGA GGGGAAGGAG 
CCAAGCTCAC TATTGAATCC ACGCCGTTCA ATGTCGCAGA GGGGAAGGAG 

151 200 
GTGCTTCTAC TTGTCCACAA TCTGCCCCAG CATCTTTTTG GCTACAGCTG 
GTGCTTCTAC TTGTCCACAA TCTGCCCCAG CATCTTTTTG GCTACAGCTG 

201 250 
GTACAAAGGT GAAAGAGTGG ATGGCAACCG TCAAATTATA GGATATGTAA 
GTACAAAGGT GAAAGAGTGG ATGGCAACCG TCAAATTATA GGATATGTAA 

251 300 
TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGG TCGAGAGATA 
TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGG TCGAGAGATA 

301 350 
ATATACCCCA ATGCATCCCT GCTGATCCAG AACATCATCC AGAATGACAC 
ATATACCCCA ATGCATCCCT GCTGATCCAG AACATCATCC AGAATGACAC 

351 400 
AGGATTCTAC ACCCTACACG TCATAAAGTC AGATCTTGTG AATGAAGAAG
AGGATTCTAC ACCCTACACG TCATAAAGTC AGATCTTGTG AATGAAGAAG 

401 450 
CAACTGGCCA GTTCCGGGTA TACCCGGAGC TGCCCAAGCC CTCCATCTCC 
CAACTGGCCA GTTCCGGGTA TACCCGGAAC TCCCTAAGCC TTCTATTAGC 

451 500 
AGCAACAACT CCAAACCCGT GGAGGACAAG GATGCTGTGG CCTTCACCTG 
TCCAATAATA GTAAGCCTGT CGAAGACAAA GATGCCGTCG CTTTTACATG 

501 550 
TGAACCTGAG ACTCAGGACG CAACCTACCT GTGGTGGGTA AACAATCAGA 
CGAGCCCGAA ACTCAAGACG CAACATATCT CTGGTGGGTG AACAACCAGT 

551 600 
GCCTCCCGGT CAGTCCCAGG CTGCAGCTGT CCAATGGCAA CAGGACCCTC 
CCCTGCCTGT GTCCCCTAGA CTCCAACTCA GCAACGGAAA TAGAACTCTG

601 650 
ACTCTATTCA ATGTCACAAG AAATGACACA GCAAGCTACA AATGTGAAAC 
ACCCTGTTTA ACGTGACCAG GAACGACACA GCAAGCTACA AATGCGAAAC 
FIGURE 2B

651 700
mCEA(6D) CCAGAACCCA GTGAGTGCCA GGCGCAGTGA TTCAGTCATC CTGAATGTCC
mCEA(6D,1st&2nd) CCAAAATCCA GTCAGCGCCA GGAGGTCTGA TTCAGTGATT CTCAACGTGC

701 750
mCEA(6D) TCTATGGCCC GGATGCCCCC ACCATTTCCC CTCTAAACAC ATCTTACAGA
mCEA(6D,1st&2nd) TTTACGGACC CGATGCTCCT ACAATCAGCC CTCTAAACAC AAGCTATAGA

751 800
mCEA(6D) TCAGGGGAAA ATCTGAACCT CTCCTGCCAC GCAGCCTCTA ACCCACCTGC
mCEA(6D,1st&2nd) TCAGGGGAAA ATCTGAATCT GAGCTGTCAT GCCGCTAGCA ATCCTCCCGC

801 850
mCEA(6D) ACAGTACTCT TGGTTTGTCA ATGGGACTTT CCAGCAATCC ACCCAAGAGC
mCEA(6D,1st&2nd) CCAATACAGC TGGTTTGTCA ATGGCACTTT CCAACAGTCC ACCCAGGAAC

851 900
mCEA(6D) TCTTTATCCC CAACATCACT GTGAATAATA GTGGATCCTA TACGTGCCAA
mCEA(6D,1st&2nd) TGTTCATTCC CAATATTACC GTGAACAATA GTGGATCCTA CACGTGCCAA

901 950
mCEA(6D) GCCCATAACT CAGACACTGG CCTCAATAGG ACCACAGTCA CGACGATCAC
mCEA(6D,1st&2nd) GCTCACAATA GCGACACCGG ACTCAACCGC ACAACCGTGA CGACGATTAC

951 1000
mCEA(6D) AGTCTATGAG CCACCCAAAC CCTTCATCAC CAGCAACAAC TCCAACCCCG
mCEA(6D,1st&2nd) CGTGTATGAG CCACCAAAAC CATTCATAAC TAGTAACAAT TCTAACCCAG

1001 1050
mCEA(6D) TGGAGGATGA GGATGCTGTA GCCTTAACCT GTGAACCTGA GATTCAGAAC
mCEA(6D,1st&2nd) TTGAGGATGA GGACGCAGTT GCATTAACTT GTGAGCCAGA GATTCAAAAT

1051 1100
mCEA(6D) ACAACCTACC TGTGGTGGGT AAATAATCAG AGCCTCCCGG TCAGTCCCAG
mCEA(6D,1st&2nd) ACCACTTATT TATGGTGGGT CAATAACCAA AGTTTGCCGG TTAGCCCACG

1101 1150
mCEA(6D) GCTGCAGCTG TCCAATGACA ACAGGACCCT CACTCTACTC AGTGTCACAA
mCEA(6D,1st&2nd) CTTGCAGTTG TCTAATGATA ACCGCACATT GACACTCCTG TCCGTTACTC

1151 1200
mCEA(6D) GGAATGATGT AGGACCCTAT GAGTGTGGAA TCCAGAACGA ATTAAGTGTT
mCEA(6D,1st&2nd) GCAATGATGT AGGACCTTAT GAGTGTGGCA TTCAGAATGA ATTATCCGTT

1201 1250
mCEA(6D) GACCACAGCG ACCCAGTCAT CCTGAATGTC CTCTATGGCC CAGACGACCC
mCEA(6D,1st&2nd) GATCACTCCG ACCCTGTTAT CCTTAATGTT TTGTATGGCC CAGACGACCC

1251 1300
mCEA(6D) CACCATTTCC CCCTCATACA CCTATTACCG TCCAGGGGTG AACCTCAGCC
mCEA(6D,1st&2nd) AACTATATCT CCATCATACA CCTACTACCG TCCCGGCGTG AACTTGAGCC
FIGURE 2C

Upper row of each pair: mCEA(6D)
Lower row of each pair: mCEA(6D,1st&2nd)

1301 1350 
TCTCCTGCCA TGCAGCCTCT AACCCACCTG CACAGTATTC TTGGCTGATT 
TTTCTTGCCA TGCAGCATCC AACCCCCCTG CACAGTACTC CTGGCTGATT 

1351 1400 
GATGGGAACA TCCAGCAACA CACACAAGAG CTCTTTATCT CCAACATCAC 
GATGGAAACA TTCAGCAGCA TACTCAAGAG TTATTTATAA GCAACATAAC 

1401 1450 
TGAGAAGAAC AGCGGACTCT ATACCTGCCA GGCCAATAAC TCAGCCAGTG 
TGAGAAGAAC AGCGGACTCT ATACTTGCCA GGCCAATAAC TCAGCCAGTG 

1451 1500 
GCCACAGCAG GACTACAGTC AAGACAATCA CAGTCTCTGC GGAGCTGCCC 
GTCACAGCAG GACTACAGTT AAAACAATAA CTGTTTCCGC GGAGCTGCCC 

1501 1550 
AAGCCCTCCA TCTCCAGCAA CAACTCCAAA CCCGTGGAGG ACAAGGATGC 
AAGCCCTCCA TCTCCAGCAA CAACTCCAAA CCCGTGGAGG ACAAGGATGC 

1551 1600 
TGTGGCCTTC ACCTGTGAAC CTGAGGCTCA GAACACAACC TACCTGTGGT 
TGTGGCCTTC ACCTGTGAAC CTGAGGCTCA GAACACAACC TACCTGTGGT 

1601 1650 
GGGTAAATGG TCAGAGCCTC CCAGTCAGTC CCAGGCTGCA GCTGTCCAAT 
GGGTAAATGG TCAGAGCCTC CCAGTCAGTC CCAGGCTGCA GCTGTCCAAT 

1651 1700 
GGCAACAGGA CCCTCACTCT ATTCAATGTC ACAAGAAATG ACGCAAGAGC 
GGCAACAGGA CCCTCACTCT ATTCAATGTC ACAAGAAATG ACGCAAGAGC 

1701 1750 
CTATGTATGT GGAATCCAGA ACTCAGTGAG TGCAAACCGC AGTGACCCAG 
CTATGTATGT GGAATCCAGA ACTCAGTGAG TGCAAACCGC AGTGACCCAG 

1751 1800 
TCACCCTGGA TGTCCTCTAT GGGCCGGACA CCCCCATCAT TTCCCCCCCA 
TCACCCTGGA TGTCCTCTAT GGGCCGGACA CCCCCATCAT TTCCCCCCCA 

1801 1850 
GACTCGTCTT ACCTTTCGGG AGCGGACCTC AACCTCTCCT GCCACTCGGC 
GACTCGTCTT ACCTTTCGGG AGCGGACCTC AACCTCTCCT GCCACTCGGC 

1851 1900 
CTCTAACCCA TCCCCGCAGT ATTCTTGGCG TATCAATGGG ATACCGCAGC 
CTCTAACCCA TCCCCGCAGT ATTCTTGGCG TATCAATGGG ATACCGCAGC 

1901 1950 
AACACACACA AGTTCTCTTT ATCGCCAAAA TCACGCCAAA TAATAACGGG 
AACACACACA AGTTCTCTTT ATCGCCAAAA TCACGCCAAA TAATAACGGG 
FIGURE 2D 



                 1951                                              2000
mCEA(6D)         ACCTATGCCT GTTTTGTCTC TAACTTGGCT ACTGGCCGCA ATAATTCCAT
mCEA(6D,1st&2nd) ACCTATGCCT GTTTTGTCTC TAACTTGGCT ACTGGCCGCA ATAATTCCAT

                 2001                                              2050
mCEA(6D)         AGTCAAGAGC ATCACAGTCT CTGCATCTGG AACTTCTCCT GGTCTCTCAG
mCEA(6D,1st&2nd) AGTCAAGAGC ATCACAGTCT CTGCATCTGG AACTTCTCCT GGTCTCTCAG

                 2051                                              2100
mCEA(6D)         CTGGGGCCAC TGTCGGCATC ATGATTGGAG TGCTGGTTGG GGTTGCTCTG
mCEA(6D,1st&2nd) CTGGGGCCAC TGTCGGCATC ATGATTGGAG TGCTGGTTGG GGTTGCTCTG

                 2101
mCEA(6D)         ATATAG
mCEA(6D,1st&2nd) ATATAG
FIGURE 3 

A. Amino Acid Sequence Comparison of "Wild-Type KSA" (1) and Modified KSA (2) 

1 MAPPQVLAFGLLLAAATATFAAAQEE[...]
2 MAPPQVLAFGLLLAAATATFAAAQEECVCENYKLAVNCFVNNNRQCQCTSVGAQNTVIC

1 SKLAAKCLVMKAEMNGSKLGRRA[...]
2 SKLAAKCLVMKAEMNGSKLGRRAKPEGALQNNDGLYDPDCDESGLFKAKQCNGTSTCWC

1 VNTAGYRRTDKDTEITCSERVRTYWIIIELKHKAREKPYDSKSLRTALQKEITTR[...]
2 VNTAGVRRTDKDTEITCSERVRTYWIIIELKHKAREKPYDSKSLRTALQKEITTRYQLD

1 PKFITSILYENNVITIDLVQNSSQKTQNDVDIADVAYYFEKDVKGESLFHSKKMDLTVN
2 PKFITSVLYENNVITIDLVQNSSQKTQNDVDIADVAYYFEKDVKGESLFHSKKMDLTVN

1 GEQLDLDPGQTLIYYVDEKAPEFSMQGLKAGVIAVIVVVVIAVV[...]
2 GEQLDLDPGQTLIYYVDEKAPEFSMQGLKAGVIAVIVVVVIAVVAGIVVLVISRKKRMA

1 KYEKAEIKEMGEMHRELNA
2 KYEKAEIKEMGEMHRELNA

B. DNA Sequence of Modified KSA 

atggcgcccccgcaggtcctcgcgttcgggcttctgcttgccgcggcgacggcgacttttgccgcagctcaggaa 
gaatgtgtctgtgaaaactacaagctggccgtaaactgctttgtgaataataatcgtcaatgccagtgtacttca 
gttggtgcacaaaatactgtcatttgctcaaagctggctgccaaatgtttggtgatgaaggcagaaatgaatggc 
tcaaaacttgggagaagagcaaaacctgaaggggccctccagaacaatgatgggctttatgatcctgactgcgat 
gagagcgggctctttaaggccaagcagtgcaacggcacctccacgtgctggtgtgtgaacactgctggggtcaga 
agaacagacaaggacactgaaataacctgctctgagcgagtgagaacctactggatcatcattgaactaaaacac 
aaagcaagagaaaaaccttatgatagtaaaagtttgcggactgcacttcagaaggagatcacaacgcgttatcaa 
ctggatccaaaatttatcacgagtgtgttgtatgagaataatgttatcactattgatctggttcaaaattcttct 
caaaaaactcagaatgatgtggacatagctgatgtggcttattattttgaaaaagatgttaaaggtgaatccttg 
tttcattctaagaaaatggacctgacagtaaatggggaacaactggatctggatcctggtcaaactttaatttat 
tatgttgatgaaaaagcacctgaattctcaatgcagggtctaaaagctggtgttattgctgttattgtggttgtg 
gtgatagcagttgttgctggaattgttgtgctggttatttccagaaagaagagaatggcaaagtatgagaaggct 
gagataaaggagatgggtgagatgcatagggaactcaatgcataa 
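
The DNA sequence in part B can be checked against the amino acid sequence in part A by translating it codon by codon. The following is a minimal sketch under stated assumptions: it uses the standard genetic code, the names `CODON_TABLE` and `translate` are hypothetical, and the two test strings are the first and last lines of the sequence as printed above.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Whitespace is ignored, unreadable codons become 'X', and a trailing
    partial codon is dropped.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


first_line = ("atggcgcccccgcaggtcctcgcgttcgggcttctgcttgccgcggcgacg"
              "gcgacttttgccgcagctcaggaa")
last_line = "gagataaaggagatgggtgagatgcatagggaactcaatgcataa"

print(translate(first_line))  # MAPPQVLAFGLLLAAATATFAAAQE
print(translate(last_line))   # EIKEMGEMHRELNA*
```

Translating the full sequence in the same way gives a protein that can be compared residue by residue with row 2 of part A.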
FIGURE 4A 
Construction of Modified KSA Plasmid 
FIGURE 4B 
Construction of Modified KSA Plasmid 
FIGURE 5

A. Plasmid Map of Modified KSA Expression Vector

[Plasmid map. Labeled elements: H6 Promoter, KSAV, Left Arm, LacZ, Amp(R), CI promoter, Right Arm, Right Arm Fragment]

B. DNA Sequence of Modified KSA Expression Vector 



Feature                 Position
Promoter H6 for KSAV    9930-9515
KSAV                    1-945
Left arm                1002-1422
Right arm               4070-5590
Right arm fragment      9012-9299
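
The positions listed above can be turned into feature lengths, which gives a quick consistency check on the coding region. The following is a minimal sketch, assuming the positions are 1-based and inclusive; the function name `feature_length` is hypothetical. A range whose start exceeds its end (as printed for the H6 promoter) is reported as `None` rather than interpreted.

```python
def feature_length(span):
    """Length of a 1-based, inclusive 'start-end' span.

    Returns None when start > end, because such a range cannot be
    interpreted without further information.
    """
    start, end = (int(part) for part in span.split("-"))
    if start > end:
        return None
    return end - start + 1


features = {
    "Promoter H6 for KSAV": "9930-9515",
    "KSAV": "1-945",
    "Left arm": "1002-1422",
    "Right arm": "4070-5590",
    "Right arm fragment": "9012-9299",
}

lengths = {name: feature_length(span) for name, span in features.items()}
for name, length in lengths.items():
    print(name, length)
```

Under these assumptions the KSAV region is 945 nucleotides long, that is, 315 codons including the stop codon.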



MetAlaProPro GlnValLeu AlaPheGly LeuLeuLeuAla AlaAlaThr- 
1 ATGGCGCCCC CGCAGGTCCT CGCGTTCGGG CTTCTGCTTG CCGCGGCGAC 
TACCGCGGGG GCGTCCAGGA GCGCAAGCCC GAAGACGAAC GGCGCCGCTG 

.AlaThrPhe AlaAlaAlaGln GluGluCys ValCysGlu AsnTyrLysLeu- 
51 GGCGACTTTT GCCGCAGCTC AGGAAGAATG TGTCTGTGAA AACTACAAGC 
CCGCTGAAAA CGGCGTCGAG TCCTTCTTAC ACAGACACTT TTGATGTTCG 
. .AlaValAsn CysPheVal AsnAsnAsnArg GlnCysGln CysThrSer 
101 TGGCCGTAAA CTGCTTTGTG AATAATAATC GTCAATGCCA GTGTACTTCA 
ACCGGCATTT GACGAAACAC TTATTATTAG CAGTTACGGT CACATGAAGT 
ValGlyAlaGln AsnThrVal IleCysSer LysLeuAlaAla LysCysLeu- 
151 GTTGGTGCAC AAAATACTGT CATTTGCTCA AAGCTGGCTG CCAAATGTTT 
CAACCACGTG TTTTATGACA GTAAACGAGT TTCGACCGAC GGTTTACAAA 
.ValMetLys AlaGluMetAsn GlySerLys LeuGlyArg ArgAlaLysPro- 
201 GGTGATGAAG GCAGAAATGA ATGGCTCAAA ACTTGGGAGA AGAGCAAAAC 
CCACTACTTC CGTCTTTACT TACCGAGTTT TGAACCCTCT TCTCGTTTTG 
. .GluGlyAla LeuGlnAsn AsnAspGlyLeu TyrAspPro AspCysAsp 
251 CTGAAGGGGC CCTCCAGAAC AATGATGGGC TTTATGATCC TGACTGCGAT 
GACTTCCCCG GGAGGTCTTG TTACTACCCG AAATACTAGG ACTGACGCTA 
GluSerGlyLeu PheLysAla LysGlnCys AsnGlyThrSer ThrCysTrp- 
301 GAGAGCGGGC TCTTTAAGGC CAAGCAGTGC AACGGCACCT CCACGTGCTG 
CTCTCGCCCG AGAAATTCCG GTTCGTCACG TTGCCGTGGA GGTGCACGAC 
.CysValAsn ThrAlaGlyVal ArgArgThr AspLysAsp ThrGluIleThr- 
351 GTGTGTGAAC ACTGCTGGGG TCAGAAGAAC AGACAAGGAC ACTGAAATAA 
CACACACTTG TGACGACCCC AGTCTTCTTG TCTGTTCCTG TGACTTTATT 
. .CysSerGlu ArgValArg ThrTyrTrpIle IleIleGlu LeuLysHis 
401 CCTGCTCTGA GCGAGTGAGA ACCTACTGGA TCATCATTGA ACTAAAACAC 
GGACGAGACT CGCTCACTCT TGGATGACCT AGTAGTAACT TGATTTTGTG 
LysAlaArgGlu LysProTyr AspSerLys SerLeuArgThr AlaLeuGln- 
451 AAAGCAAGAG AAAAACCTTA TGATAGTAAA AGTTTGCGGA CTGCACTTCA 
TTTCGTTCTC TTTTTGGAAT ACTATCATTT TCAAACGCCT GACGTGAAGT 
.LysGluIle ThrThrArgTyr GlnLeuAsp ProLysPhe IleThrSerVal- 
501 GAAGGAGATC ACAACGCGTT ATCAACTGGA TCCAAAATTT ATCACGAGTG 
CTTCCTCTAG TGTTGCGCAA TAGTTGACCT AGGTTTTAAA TAGTGCTCAC 
. .LeuTyrGlu AsnAsnVal IleThrIleAsp LeuValGln AsnSerSer 
551 TGTTGTATGA GAATAATGTT ATCACTATTG ATCTGGTTCA AAATTCTTCT 
ACAACATACT CTTATTACAA TAGTGATAAC TAGACCAAGT TTTAAGAAGA 
GlnLysThrGln AsnAspVal AspIleAla AspValAlaTyr TyrPheGlu- 
601 CAAAAAACTC AGAATGATGT GGACATAGCT GATGTGGCTT ATTATTTTGA 
GTTTTTTGAG TCTTACTACA CCTGTATCGA CTACACCGAA TAATAAAACT 
.LysAspVal LysGlyGluSer LeuPheHis SerLysLys MetAspLeuThr- 
651 AAAAGATGTT AAAGGTGAAT CCTTGTTTCA TTCTAAGAAA ATGGACCTGA 
TTTTCTACAA TTTCCACTTA GGAACAAAGT AAGATTCTTT TACCTGGACT 
. .ValAsnGly GluGlnLeu AspLeuAspPro GlyGlnThr LeuIleTyr 
701 CAGTAAATGG GGAACAACTG GATCTGGATC CTGGTCAAAC TTTAATTTAT 
GTCATTTACC CCTTGTTGAC CTAGACCTAG GACCAGTTTG AAATTAAATA 
TyrValAspGlu LysAlaPro GluPheSer MetGlnGlyLeu LysAlaGly 
751 TATGTTGATG AAAAAGCACC TGAATTCTCA ATGCAGGGTC TAAAAGCTGG 
ATACAACTAC TTTTTCGTGG ACTTAAGAGT TACGTCCCAG ATTTTCGACC 
.ValIleAla ValIleValVal ValValIle AlaValVal AlaGlyIleVal- 
801 TGTTATTGCT GTTATTGTGG TTGTGGTGAT AGCAGTTGTT GCTGGAATTG 
ACAATAACGA CAATAACACC AACACCACTA TCGTCAACAA CGACCTTAAC 
. .ValLeuVal IleSerArg LysLysArgMet AlaLysTyr GluLysAla 
851 TTGTGCTGGT TATTTCCAGA AAGAAGAGAA TGGCAAAGTA TGAGAAGGCT 
AACACGACCA ATAAAGGTCT TTCTTCTCTT ACCGTTTCAT ACTCTTCCGA 
GluIleLysGlu MetGlyGlu MetHisArg GluLeuAsnAla *** 
901 GAGATAAAGG AGATGGGTGA GATGCATAGG GAACTCAATG CATAAGAAGC 
CTCTATTTCC TCTACCCACT CTACGTATCC CTTGAGTTAC GTATTCTTCG 
951 TTATCGATAC CGTCGACCTC GAGGAATTCT TTTTATTGAT TAACTAGTTA 
AATAGCTATG GCAGCTGGAG CTCCTTAAGA AAAATAACTA ATTGATCAAT 
1001 ATCACGGCCG CTTATAAAGA TCTAAAATGC ATAATTTCTA AATAATGAAA 
TAGTGCCGGC GAATATTTCT AGATTTTACG TATTAAAGAT TTATTACTTT 
1051 AAAAAGTACA TCATGAGCAA CGCGTTAGTA TATTTTACAA TGGAGATTAA 
TTTTTCATGT AGTACTCGTT GCGCAATCAT ATAAAATGTT ACCTCTAATT 
1101 CGCTCTATAC CGTTCTATGT TTATTGATTC AGATGATGTT TTAGAAAAGA 
GCGAGATATG GCAAGATACA AATAACTAAG TCTACTACAA AATCTTTTCT 
1151 AAGTTATTGA ATATGAAAAC TTTAATGAAG ATGAAGATGA CGACGATGAT 
TTCAATAACT TATACTTTTG AAATTACTTC TACTTCTACT GCTGCTACTA 
1201 TATTGTTGTA AATCTGTTTT AGATGAAGAA GATGACGCGC TAAAGTATAC 
ATAACAACAT TTAGACAAAA TCTACTTCTT CTACTGCGCG ATTTCATATG 
1251 TATGGTTACA AAGTATAAGT CTATACTACT AATGGCGACT TGTGCAAGAA 
ATACCAATGT TTCATATTCA GATATGATGA TTACCGCTGA ACACGTTCTT 
1301 GGTATAGTAT AGTGAAAATG TTGTTAGATT ATGATTATGA AAAACCAAAT 
CCATATCATA TCACTTTTAC AACAATCTAA TACTAATACT TTTTGGTTTA 
1351 AAATCAGATC CATATCTAAA GGTATCTCCT TTGCACATAA TTTCATCTAT 
TTTAGTCTAG GTATAGATTT CCATAGAGGA AACGTGTATT AAAGTAGATA 
1401 TCCTAGTTTA GAATACCTGC AGCCAAGCTT GGCACTGGCC GTCGTTTTAC 
AGGATCAAAT CTTATGGACG TCGGTTCGAA CCGTGACCGG CAGCAAAATG 
1451 AACGTCGTGA CTGGGAAAAC CCTGGCGTTA CCCAACTTAA TCGCCTTGCA 
TTGCAGCACT GACCCTTTTG GGACCGCAAT GGGTTGAATT AGCGGAACGT 
1501 GCACATCCCC CTTTCGCCAG CTGGCGTAAT AGCGAAGAGG CCCGCACCGA 
CGTGTAGGGG GAAAGCGGTC GACCGCATTA TCGCTTCTCC GGGCGTGGCT 
1551 TCGCCCTTCC CAACAGTTGC GCAGCCTGAA TGGCGAATGG CGCCTGATGC 
AGCGGGAAGG GTTGTCAACG CGTCGGACTT ACCGCTTACC GCGGACTACG 
1601 GGTATTTTCT CCTTACGCAT CTGTGCGGTA TTTCACACCG CATATGGTGC 
1651 


ACTCTCAGTA 




TGAGAGTCAT 


1701 


CCCGCCAACA 




GGGCGGTTGT 


1751 


CCGCTTACAG 




GGCGAATGTC 


1801 


TTTTCACCGT 




AAAAGTGGCA 




V-V- 14-il 111. 1 A 




fiHaTaaAAAT 


1 Q01 


vj 1 wWiv* 111 




pa rrvsTGa a a 


ijji 


JLrUui liv-xrvl 1 




7V •IMIU1171 *TVn*T*'7A A 
A A X iAlbiifiA 


z uux 






/*t/"i TV 7V ^^11 i*PTV '"P*P 




M^T'/^^f lit 1171 I' 




7v nnnpp7i tv*ptv 


2101 


CCAGAAACGC 




GGTCTTTGCG 


2151 


AGTGGGTTAC 




TCACCCAATG 


2201 


TTCGCCCCGA 




AAGCGGGGCT 


2251 


TGTGGCGCGG 




ACACCGCGCC 


2301 


CCGCATACAC 




GGCGTATGTG 


2351 


AAAAGCATCT 




TTTTCGTAGA 


Z4U1 


Al AAv_(JAl OA 




1A1 iljKj A AL. X 








± v— louL llv. 


o cm 


PTPP PPTTni 




PB nrnna a pt 


Z DDJL 






pTPfi p a PTdT 

UlLuv«nVvXul 


ZOU J. 


a Tra a ptggp 




TaaTTnappn 




pnaTnnaGGP 




ppTa rr^roca 


Z / Ul 


1 1 VSW 1 




pnappnappa 


Z /r> J. 


Pf2f2Ta TPATT 
VyV7ClnlV>nl 1 




pppaTafyraa 


ZoU J. 


tt a t pt a p a p 

1 1A1 V- lALnv. 




aaTanaTGTG 


ZOO JL 


a TPnPTGana 




TAGCGACTCT 


2901 


AGTTTACTCA 




TCAAATGAGT 


2951 


AAAGGATCTA 




TTTCCTAGAT 


3001 


TAACGTGAGT 




ATTGCACTCA 


3051 


AGGATCTTCT 




TCCTAGAAGA 



ggaatgcgta 
caatctgctc 
gttagacgag 
cccgctgacg 

GGGCGACTGC 

acaagctgtg 
tgttcgacac 
catcaccgaa 
gtagtggctt 
taggttaatg 

ATCCAATTAC 
TCGGGGAAAT 
AGCCCCTTTA 
CAAATATGTA 
GTTTATACAT 
TATTGAAAAA 
ATAACTTTTT 
TCCCTTTTTT 
AGGGAAAAAA 
TGGTGAAAGT 
ACCACTTTCA 
ATCGAACTGG 
TAGCTTGACC 
AGAACGTTTT 

tcttgcaaaa 

TATTATCCCG 
ATAATAGGGC 
TATTCTCAGA 
ATAAGAGTCT 
TACGGATGGC 
ATGCCTACCG 
GTGATAACAC 
CACTATTGTG 
GAGCTAACCG 
CTCGATTGGC 
TCGTTGGGAA 
AGCAACCCTT 
CCACGATGCC 
GGTGCTACGG 
GAACTACTTA 
CTTGATGAAT 
GGATAAAGTT 

cctatttcaa 

TTATTGCTGA 
AATAACGACT 
GCAGCACTGG 
CGTCGTGACC 
GACGGGGAGT 
CTGCCCCTCA 
TAGGTGCCTC 
ATCCACGGAG 
TATATACTTT 
ATATATGAAA 
GGTGAAGATC 
CCACTTCTAG 
TTTCGTTCCA 

aaagcaaggt 

TGAGATCCTT 
ACTCTAGGAA 



GACACGCCAT 
TGATGCCGCA 
ACTACGGCGT 
CGCCCTGACG 
GCGGGACTGC 
ACCGTCTCCG 
TGGCAGAGGC 
ACGCGCGAGA 
TGCGCGCTCT 
TCATGATAAT 
AGTACTATTA 
GTGCGCGGAA 
CACGCGCCTT 
TCCGCTCATG 
AGGCGAGTAC 
GGAAGAGTAT 
CCTTCTCATA 
GCGGCATTTT 
CGCCGTAAAA 
AAAAGATGCT 
TTTTCTACGA 
ATCTCAACAG 
TAGAGTTGTC 
CCAATGATGA 
GGTTACTACT 
TATTGACGCC 
ATAACTGCGG 
ATGACTTGGT 
TACTGAACCA 
ATGACAGTAA 
TACTGTCATT 
TGCGGCCAAC 
ACGCCGGTTG 
CTTTTTTGCA 
GAAAAAACGT 
CCGGAGCTGA 
GGCCTCGACT 
TGTAGCAATG 
ACATCGTTAC 
CTCTAGCTTC 
GAGATCGAAG 
GCAGGACCAC 
CGTCCTGGTG 
TAAATCTGGA 
ATTTAGACCT 
GGCCAGATGG 
CCGGTCTACC 
CAGGCAACTA 
GTCCGTTGAT 
ACTGATTAAG 
TGACTAATTC 
AGATTGATTT 
TCTAACTAAA 
CTTTTTGATA 
GAAAAACTAT 
CTGAGCGTCA 
GACTCGCAGT 
TTTTTCTGCG 
AAAAAGACGC 



AAAGTGTGGC 

TAGTTAAGCC 

ATCAATTCGG 

GGCTTGTCTG 

CCGAACAGAC 

GGAGCTGCAT 

CCTCGACGTA 

CGAAAGGGCC 

GCTTTCCCGG 

AATGGTTTCT 

TTACCAAAGA 

CCCCTATTTG 

GGGGATAAAC 

AGACAATAAC 

TCTGTTATTG 

GAGTATTCAA 

CTCATAAGTT 

GCCTTCCTGT 

CGGAAGGACA 

GAAGATCAGT 

CTTCTAGTCA 

CGGTAAGATC 

GCCATTCTAG 

GCACTTTTAA 

CGTGAAAATT 

GGGCAAGAGC 

CCCGTTCTCG 

TGAGTACTCA 

ACTCATGAGT 

GAGAATTATG 

CTCTTAATAC 

TTACTTCTGA 

AATGAAGACT 

CAACATGGGG 

GTTGTACCCC 

ATGAAGCCAT 

TACTTCGGTA 

GCAACAACGT 

CGTTGTTGCA 

CCGGCAACAA 

GGCCGTTGTT 

TTCTGCGCTC 

AAGACGCGAG 

GCCGGTGAGC 

CGGCCACTCG 

TAAGCCCTCC 

ATTCGGGAGG 

TGGATGAACG 

ACCTACTTGC 
cattggtaac 
gtaaccattg 
aaaacttcat 

TTTTGAAGTA 
ATCTCATGAC 
TAGAGTACTG 
GACCCCGTAG 
CTGGGGCATC 
CGTAATCTGC 
GCATTAGACG 



GTATACCACG 

AGCCCCGACA 

TCGGGGCTGT 

CTCCCGGCAT 

GAGGGCCGTA 

GTGTCAGAGG 

CACAGTCTCC 

TCGTGATACG 

AGCACTATGC 

TAGACGTCAG 

ATCTGCAGTC 

TTTATTTTTC 

AAATAAAAAG 

CCTGATAAAT 

GGACTATTTA 

CATTTCCGTG 

GTAAAGGCAC 

TTTTGCTCAC 

AAAACGAGTG 

TGGGTGCACG 

ACCCACGTGC 

CTTGAGAGTT 

GAACTCTCAA 

AGTTCTGCTA 

TCAAGACGAT 

AACTCGGTCG 

TTGAGCCAGC 

CCAGTCACAG 

GGTCAGTGTC 

CAGTGCTGCC 

GTCACGACGG 

CAACGATCGG 

GTTGCTAGCC 

GATCATGTAA 

CTAGTACATT 

ACCAAACGAC 

TGGTTTGCTG 

TGCGCAAACT 

ACGCGTTTGA 

TTAATAGACT 

AATTATCTGA 

GGCCCTTCCG 

CCGGGAAGGC 

GTGGGTCTCG 

CACCCAGAGC 

CGTATCGTAG 

GCATAGCATC 

AAATAGACAG 

TTTATCTGTC 

TGTCAGACCA 

ACAGTCTGGT 

TTTTAATTTA 

AAAATTAAAT 

CAAAATCCCT 

GTTTTAGGGA 

AAAAGATCAA 

TTTTCTAGTT 

TGCTTGCAAA 

ACGAACGTTT 
CAAAAAAACC ACCGCTACCA GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA 
GTTTTTTTGG TGGCGATGGT CGCCACCAAA CAAACGGCCT AGTTCTCGAT 
CCAACTCTTT TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA 
GGTTGAGAAA AAGGCTTCCA TTGACCGAAG TCGTCTCGCG TCTATGGTTT 
TACTGTCCTT CTAGTGTAGC CGTAGTTAGG CCACCACTTC AAGAACTCTG 
ATGACAGGAA GATCACATCG GCATCAATCC GGTGGTGAAG TTCTTGAGAC 
TAGCACCGCC TACATACCTC GCTCTGCTAA TCCTGTTACC AGTGGCTGCT 
ATCGTGGCGG ATGTATGGAG CGAGACGATT AGGACAATGG TCACCGACGA 
GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA GACGATAGTT 
CGGTCACCGC TATTCAGCAC AGAATGGCCC AACCTGAGTT CTGCTATCAA 
ACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC 
TGGCCTATTC CGCGTCGCCA GCCCGACTTG CCCCCCAAGC ACGTGTGTCG 
CCAGCTTGGA GCGAACGACC TACACCGAAC TGAGATACCT ACAGCGTGAG 
GGTCGAACCT CGCTTGCTGG ATGTGGCTTG ACTCTATGGA TGTCGCACTC 
CTATGAGAAA GCGCCACGCT TCCCGAAGGG AGAAAGGCGG ACAGGTATCC 
GATACTCTTT CGCGGTGCGA AGGGCTTCCC TCTTTCCGCC TGTCCATAGG 
GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG CACGAGGGAG CTTCCAGGGG 
CCATTCGCCG TCCCAGCCTT GTCCTCTCGC GTGCTCCCTC GAAGGTCCCC 
GAAACGCCTG GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT 
CTTTGCGGAC CATAGAAATA TCAGGACAGC CCAAAGCGGT GGAGACTGAA 
GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC TATGGAAAAA 
CTCGCAGCTA AAAACACTAC GAGCAGTCCC CCCGCCTCGG ATACCTTTTT 
CGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG 
GCGGTCGTTG CGCCGGAAAA ATGCCAAGGA CCGGAAAACG ACCGGAAAAC 
CTCACATGTT CTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT 
GAGTGTACAA GAAAGGACGC AATAGGGGAC TAAGACACCT ATTGGCATAA 
ACCGCCTTTG AGTGAGCTGA TACCGCTCGC CGCAGGCGAA CGACCGAGCG 
TGGCGGAAAC TCACTCGACT ATGGCGAGCG GCGTCGGCTT GCTGGCTCGC 
CAGCGAGTCA GTGAGCGAGG AAGCGGAAGA GCGCCCAATA CGCAAACCGC 
GTCGCTCAGT CACTCGCTCC TTCGCCTTCT CGCGGGTTAT GCGTTTGGCG 
CTCTCCCCGC GCGTTGGCCG ATTCATTAAT GCAGCTGGCA CGACAGGTTT 
GAGAGGGGCG CGCAACCGGC TAAGTAATTA CGTCGACCGT GCTGTCCAAA 
CCCGACTGGA AAGCGGGCAG TGAGCGCAAC GCAATTAATG TGAGTTAGCT 
GGGCTGACCT TTCGCCCGTC ACTCGCGTTG CGTTAATTAC ACTCAATCGA 
CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 
GTGAGTAATC CGTGGGGTCC GAAATGTGAA ATACGAAGGC CGAGCATACA 
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 
ACACACCTTA ACACTCGCCT ATTGTTAAAG TGTGTCCTTT GTCGATACTG 
CATGATTACG AATTGAATTG CGGCCGCAAT TCTGAATGTT AAATGTTATA 
GTACTAATGC TTAACTTAAC GCCGGCGTTA AGACTTACAA TTTACAATAT 
CTTTGGATGA AGCTATAAAT ATGCATTGGA AAAATAATCC ATTTAAAGAA 
GAAAGCTACT TCGATATTTA TACGTAACCT TTTTATTAGG TAAATTTCTT 
AGGATTCAAA TACTACAAAA CCTAAGCGAT AATATGTTAA CTAAGCTTAT 
TCCTAAGTTT ATGATGTTTT GGATTCGCTA TTATACAATT GATTCGAATA 
TCTTAACGAC GCTTTAAATA TACACAAATA AACATAATTT TTGTATAACC 
AGAATTGCTG CGAAATTTAT ATGTGTTTAT TTGTATTAAA AACATATTGG 
TAACAAATAA CTAAAACATA AAAATAATAA AAGGAAATGT AATATCGTAA 
ATTGTTTATT GATTTTGTAT TTTTATTATT TTCCTTTACA TTATAGCATT 
TTATTTTACT CAGGAATGGG GTTAAATATT TATATCACGT GTATATCTAT 
AATAAAATGA GTCCTTACCC GAATTTATAA ATATAGTGCA CATATAGATA 
ACTGTTATCG TATACTCTTT ACAATTACTA TTACGAATAT GCAAGAGATA 
TGACAATAGC ATATGAGAAA TGTTAATGAT AATGCTTATA CGTTCTCTAT 
ATAAGATTAC GTATTTAAGA GAATCTTGTC ATGATAATTG GGTACGACAT 
TATTCTAATG CATAAATTCT CTTAGAACAG TACTATTAAC CCATGCTGTA 
AGTGATAAAT GCTATTTCGC ATCGTTACAT AAAGTCAGTT GGAAAGATGG 
TCACTATTTA CGATAAAGCG TAGCAATGTA TTTCAGTCAA CCTTTCTACC 
ATTTGACAGA TGTAACTTAA TAGGTGCAAA AATGTTAAAT AACAGCATTC 
TAAACTGTCT ACATTGAATT ATCCACGTTT TTACAATTTA TTGTCGTAAG 
TATCGGAAGA TAGGATACCA GTTATATTAT ACAAAAATCA CTGGTTGGAT 
ATAGCCTTCT ATCCTATGGT CAATATAATA TGTTTTTAGT GACCAACCTA 
AAAACAGATT CTGCAATATT CGTAAAAGAT GAAGATTACT GCGAATTTGT 
TTTTGTCTAA GACGTTATAA GCATTTTCTA CTTCTAATGA CGCTTAAACA 
AAACTATGAC AATAAAAAGC CATTTATCTC AACGACATCG TGTAATTCTT 
TTTGATACTG TTATTTTTCG GTAAATAGAG TTGCTGTAGC ACATTAAGAA 
CCATGTTTTA TGTATGTGTT TCAGATATTA TGAGATTACT ATAAACTTTT 
GGTACAAAAT ACATACACAA AGTCTATAAT ACTCTAATGA TATTTGAAAA 
TGTATACTTA TATTCCGTAA ACTATATTAA TCATGAAGAA AATGAAAAAG 
ACATATGAAT ATAAGGCATT TGATATAATT AGTACTTCTT TTACTTTTTC 
TATAGAAGCT GTTCACGAGC GGTTGTTGAA AACAACAAAA TTATACATTC 
ATATCTTCGA CAAGTGCTCG CCAACAACTT TTGTTGTTTT AATATGTAAG 
AAGATGGCTT ACATATACGT CTGTGAGGCT ATCATGGATA ATGACAATGC 
TTCTACCGAA TGTATATGCA GACACTCCGA TAGTACCTAT TACTGTTACG 
ATCTCTAAAT AGGTTTTTGG ACAATGGATT CGACCCTAAC ACGGAATATG 
TAGAGATTTA TCCAAAAACC TGTTACCTAA GCTGGGATTG TGCCTTATAC 
GTACTCTACA ATCTCCTCTT GAAATGGCTG TAATGTTCAA GAATACCGAG 
CATGAGATGT TAGAGGAGAA CTTTACCGAC ATTACAAGTT CTTATGGCTC 
GCTATAAAAA TCTTGATGAG GTATGGAGCT AAACCTGTAG TTACTGAATG 
CGATATTTTT AGAACTACTC CATACCTCGA TTTGGACATC AATGACTTAC 
CACAACTTCT TGTCTGCATG ATGCGGTGTT GAGAGACGAC TACAAAATAG 
GTGTTGAAGA ACAGACGTAC TACGCCACAA CTCTCTGCTG ATGTTTTATC 
TGAAAGATCT GTTGAAGAAT AACTATGTAA ACAATGTTCT TTACAGCGGA 
ACTTTCTAGA CAACTTCTTA TTGATACATT TGTTACAAGA AATGTCGCCT 
GGCTTTACTC CTTTGTGTTT GGCAGCTTAC CTTAACAAAG TTAATTTGGT 
CCGAAATGAG GAAACACAAA CCGTCGAATG GAATTGTTTC AATTAAACCA 
TAAACTTCTA TTGGCTCATT CGGCGGATGT AGATATTTCA AACACGGATC 
ATTTGAAGAT AACCGAGTAA GCCGCCTACA TCTATAAAGT TTGTGCCTAG 
GGTTAACTCC TCTACATATA GCCGTATCAA ATAAAAATTT AACAATGGTT 
CCAATTGAGG AGATGTATAT CGGCATAGTT TATTTTTAAA TTGTTACCAA 
AAACTTCTAT TGAACAAAGG TGCTGATACT GACTTGCTGG ATAACATGGG 
TTTGAAGATA ACTTGTTTCC ACGACTATGA CTGAACGACC TATTGTACCC 
ATGTACTCCT TTAATGATCG CTGTACAATC TGGAAATATT GAAATATGTA 
TACATGAGGA AATTACTAGC GACATGTTAG ACCTTTATAA CTTTATACAT 
GCACACTACT TAAAAAAAAT AAAATGTCCA GAACTGGGAA AAATTGATCT 
CGTGTGATGA ATTTTTTTTA TTTTACAGGT CTTGACCCTT TTTAACTAGA 
TGCCAGCTGT AATTCATGGT AGAAAAGAAG TGCTCAGGCT ACTTTTCAAC 
ACGGTCGACA TTAAGTACCA TCTTTTCTTC ACGAGTCCGA TGAAAAGTTG 
AAAGGAGCAG ATGTAAACTA CATCTTTGAA AGAAATGGAA AATCATATAC 
TTTCCTCGTC TACATTTGAT GTAGAAACTT TCTTTACCTT TTAGTATATG 
TGTTTTGGAA TTGATTAAAG AAAGTTACTC TGAGACACAA AAGAGGTAGC 
ACAAAACCTT AACTAATTTC TTTCAATGAG ACTCTGTGTT TTCTCCATCG 
TGAAGTGGTA CTCTCAAAGG TACGTGACTA ATTAGCTATA AAAAGGATCC 
ACTTCACCAT GAGAGTTTCC ATGCACTGAT TAATCGATAT TTTTCCTAGG 
TAGAGGATCA TTATTTAACG TAAACTAAAT GGAAAAGCTA TTTACAGGTA 
ATCTCCTAGT AATAAATTGC ATTTGATTTA CCTTTTCGAT AAATGTCCAT 
CATACGGTGT TTTCTGGAAT CAAATGATTC TGATTTTGAG GATTTTATCA 
GTATGCCACA AAAGACCTTA GTTTACTAAG ACTAAAACTC CTAAAATAGT 
ATACAATAAT GACAGTGCTA ACTGGTAAAA AAGAAAGCAA ACAATTATCA 
TATGTTATTA CTGTCACGAT TGACCATTTT TTCTTTCGTT TGTTAATAGT 
TGGCTAACAA TTTTTATTAT ATTTGTAGTA TGCATAGTGG TCTTTACGTT 
ACCGATTGTT AAAAATAATA TAAACATCAT ACGTATCACC AGAAATGCAA 
TCTTTATTTA AAGTTAATGT GTTAAGATTA AATGGAGTAA TTGGATCCCC 
AGAAATAAAT TTCAATTACA CAATTCTAAT TTACCTCATT AACCTAGGGG 
CATCGATGGG GAATTCACTG GCCGTCGTTT TACAACGTCG TGACTGGGAA 
GTAGCTACCC CTTAAGTGAC CGGCAGCAAA ATGTTGCAGC ACTGACCCTT 
AACCCTGGCG TTACCCAACT TAATCGCCTT GCAGCACATC CCCCTTTCGC 
TTGGGACCGC AATGGGTTGA ATTAGCGGAA CGTCGTGTAG GGGGAAAGCG 
CAGCTGGCGT AATAGCGAAG AGGCCCGCAC CGATCGCCCT TCCCAACAGT 
GTCGACCGCA TTATCGCTTC TCCGGGCGTG GCTAGCGGGA AGGGTTGTCA 
TGCGCAGCCT 
ACGCGTCGGA 
GCGGTGCCGG 
CGCCACGGCC 
CGTCGTCCCC 
GCAGCAGGGG 
CCAACGTAAC 
GGTTGCATTG 
AATCCGACGG 
TTAGGCTGCC 
ACAGGAAGGC 
TGTCCTTCCG 
ATCTGTGGTG 
TAGACACCAC 
CCGTCTGAAT 
GGCAGACTTA 
CGCGGTGATG 
GCGCCACTAC 
ATATGTGGCG 
TATACACCGC 
CCGACTACAC 
GGCTGATGTG 
TTTCAGCCGC 
AAAGTCGGCG 
GTGACTACCT 
CACTGATGGA 
GCCAGCGGCA 
CGGTCGCCGT 
TTATGCCGAT 
AATACGGCTA 
GGAGCGCCGA 
CCTCGCGGCT 
GCCGACGGCA 
CGGCTGCCGT 
GGTGCGGATT 
CCACGCCTAA 
TTCGAGGCGT 
AAGCTCCGCA 
GATGAGGAGA 
CTACTCGTCT 
TAACGCCGTG 
ATTGCGGCAC 
TGTGCGACCG 
ACACGCTGGC 
CACGGCATGG 
GTGCCGTACC 
GGCGATGAGC 
CCGCTACTCG 
CGAGTGTGAT 
GCTCACACTA 
CACGACGCGC 
GTGCTGCGCG 
GCAGTATGAA 
CGTCATACTT 
CGATGTACGC 
GCTACATGCG 
TGGTCCATCA 
ACCAGGTAGT 
CCTTTGCGAA 



GAATGGCGAA 
CTTACCGCTT 
AAAGCTGGCT 
TTTCGACCGA 
TCAAACTGGC 
AGTTTGACCG 
CTATCCCATT 
GATAGGGTAA 
GTTGTTACTC 
CAACAATGAG 
CAGACGCGAA 
GTCTGCGCTT 
CAACGGGCGC 
GTTGCCCGCG 
TTGACCTGAG 
AACTGGACTC 
GTGCTGCGTT 
CACGACGCAA 
GATGAGCGGC 
CTACTCGCCG 
AAATCAGCGA 
TTTAGTCGCT 
GCTGTACTGG 
CGACATGACC 
ACGGGTAACA 
TGCCCATTGT 
CCGCGCCTTT 
GGCGCGGAAA 
CGCGTCACAC 
GCGCAGTGTG 
AATCCCGAAT 
TTAGGGCTTA 
CGCTGATTGA 
GCGACTAACT 
GAAAATGGTC 
CTTTTACCAG 
TAACCGTCAC 
ATTGGCAGTG 
CGATGGTGCA 
GCTACCACGT 
CGCTGTTCGC 
GCGACAAGCG 
CTACGGCCTG 
GATGCCGGAC 
TGCCAATGAA 
ACGGTTACTT 
GAACGCGTAA 
CTTGCGCATT 
CATCTGGTCG 
GTAGACCAGC 
TGTATCGCTG 
ACATAGCGAC 
GGCGGCGGAG 
CCGCCGCCTC 
GCGCGTGGAT 
CGCGCACCTA 
AAAAATGGCT 
TTTTTACCGA 
TACGCCCACG 



TGGCGCTTTG 
ACCGCGAAAC 
GGAGTGCGAT 
CCTCACGCTA 
AGATGCACGG 
TCTACGTGCC 
ACGGTCAATC 
TGCCAGTTAG 
GCTCACATTT 
CGAGTGTAAA 
TTATTTTTGA 
AATAAAAACT 
TGGGTCGGTT 
ACCCAGCCAA 
CGCATTTTTA 
GCGTAAAAAT 
GGAGTGACGG 
CCTCACTGCC 
ATTTTCCGTG 
TAAAAGGCAC 
TTTCCATGTT 
AAAGGTACAA 
AGGCTGAAGT 
TCCGACTTCA 
GTTTCTTTAT 
CAAAGAAATA 
CGGCGGTGAA 
GCCGCCACTT 
TACGTCTGAA 
ATGCAGACTT 
CTCTATCGTG 
GAGATAGCAC 
AGCAGAAGCC 
TCGTCTTCGG 
TGCTGCTGCT 
ACGACGACGA 
GAGCATCATC 
CTCGTAGTAG 
GGATATCCTG 
CCTATAGGAC 
ATTATCCGAA 
TAATAGGCTT 
TATGTGGTGG 
ATACACCACC 
TCGTCTGACC 
AGCAGACTGG 
CGCGAATGGT 
GCGCTTACCA 
CTGGGGAATG 
GACCCCTTAC 
GATCAAATCT 
CTAGTTTAGA 
CCGACACCAC 
GGCTGTGGTG 
GAAGACCAGC 
CTTCTGGTCG 
TTCGCTACCT 
AAGCGATGGA 
CGATGGGTAA 



CCTGGTTTCC 
GGACCAAAGG 
CTTCCTGAGG 
GAAGGACTCC 
TTACGATGCG 
AATGCTACGC 
CGCCGTTTGT 
GCGGCAAACA 
AATGTTGATG 
TTACAACTAC 
TGGCGTTAAC 
ACCGCAATTG 
ACGGCCAGGA 
TGCCGGTCCT 
CGCGCCGGAG 
GCGCGGCCTC 
CAGTTATCTG 
GTCAATAGAC 
ACGTCTCGTT 
TGCAGAGCAA 
GCCACTCGCT 
CGGTGAGCGA 
TCAGATGTGC 
AGTCTACACG 
GGCAGGGTGA 
CCGTCCCACT 
ATTATCGATG 
TAATAGCTAC 
CGTCGAAAAC 
GCAGCTTTTG 
CGGTGGTTGA 
GCCACCAACT 
TGCGATGTCG 
ACGCTACAGC 
GAACGGCAAG 
CTTGCCGTTC 
CTCTGCATGG 
GAGACGTACC 
CTGATGAAGC 
GACTACTTCG 
CCATCCGCTG 
GGTAGGCGAC 
ATGAAGCCAA 
TACTTCGGTT 
GATGATCCGC 
CTACTAGGCG 
GCAGCGCGAT 
CGTCGCGCTA 
AATCAGGCCA 
TTAGTCCGGT 
GTCGATCCTT 
CAGCTAGGAA 
GGCCACCGAT 
CCGGTGGCTA 
CCTTCCCGGC 
GGAAGGGCCG 
GGAGAGACGC 
CCTCTCTGCG 
CAGTCTTGGC 



GGCACCAGAA 
CCGTGGTCTT 
CCGATACTGT 
GGCTATGACA 
CCCATCTACA 
GGGTAGATGT 
TCCCACGGAG 
AGGGTGCCTC 
AAAGCTGGCT 
TTTCGACCGA 
TCGGCGTTTC 
AGCCGCAAAG 
CAGTCGTTTG 
GTCAGCAAAC 
AAAACCGCCT 
TTTTGGCGGA 
GAAGATCAGG 
CTTCTAGTCC 
GCTGCATAAA 
CGACGTATTT 
TTAATGATGA 
AATTACTACT 
GGCGAGTTGC 
CCGCTCAACG 
AACGCAGGTC 
TTGCGTCCAG 
AGCGTGGTGG 
TCGCACCACC 
CCGAAACTGT 
GGCTTTGACA 
ACTGCACACC 
TGACGTGTGG 
GTTTCCGCGA 
CAAAGGCGCT 
CCGTTGCTGA 
GGCAACGACT 
TCAGGTCATG 
AGTCCAGTAC 
AGAACAACTT 
TCTTGTTGAA 
TGGTACACGC 
ACCATGTGCG 
TATTGAAACC 
ATAACTTTGG 
GCTGGCTACC 
CGACCGATGG 
CGTAATCACC 
GCATTAGTGG 
CGGCGCTAAT 
GCCGCGATTA 
CCCGCCCGGT 
GGGCGGGCCA 
ATTATTTGCC 
TAATAAACGG 
TGTGCCGAAA 
ACACGGCTTT 
GCCCGCTGAT 
CGGGCGACTA 
GGTTTCGCTA 
GGAAACGCTT 
AATACTGGCA 
TTATGACCGT 
TGGGACTGGG 
ACCCTGACCC 
GTGGTCGGCT 
CACCAGCCGA 
TCTGTATGAA 
AGACATACTT 
ACGGAAGCAA 
TGCCTTCGTT 
AACCATCGAA 
TTGGTAGCTT 
TCCTGCACTG 
AGGACGTGAC 
GTGCCTCTGG 
CACGGAGACC 
ACTACCGCAG 
TGATGGCGTC 
TGCAACCGAA 
ACGTTGGCTT 
CAGCAGTGGC 
GTCGTCACCG 
CCACGCCATC 
GGTGCGGTAG 
TGGGTAATAA 
ACCCATTATT 
ATGTGGATTG 
TACACCTAAC 
CACCCGTGCA 
GTGGGCACGT 
TTGACCCTAA 
AACTGGGATT 
GCCGAAGCAG 
CGGCTTCGTC 
GCTGATTACG 
CGACTAATGC 
TCAGCCGGAA 
AGTCGGCCTT 
GTTGATGTTG 
CAACTACAAC 
GAACTGCCAG 
CTTGACGGTC 
GGCCGCAAGA 
CCGGCGTTCT 
TGGGATCTGC 
ACCCTAGACG 
AAACGGTCTG 
TTTGCCAGAC 
GGCGCGGCGA 
CCGCGCCGCT 
ATGGAAACCA 
TACCTTTGGT 
GAATATCGAC 
CTTATAGCTG 
CGTCAGTATC 
GCAGTCATAG 
TTGGTCTGGT 
AACCAGACCA 



ATGCGGGTGC 
GGCGTTTCGT 
CCGCAAAGCA 
TGGATCAGTC 
ACCTAGTCAG 
TACGGCGGTG 
ATGCCGCCAC 
CGGTCTGGTC 
GCCAGACCAG 
AACACCAGCA 
TTGTGGTCGT 
GTGACCAGCG 
CACTGGTCGC 
GATGGTGGCG 
CTACCACCGC 
ATGTCGCTCC 
TACAGCGAGG 
CCGGAGAGCG 
GGCCTCTCGC 
CGCGACCGCA 
GCGCTGGCGT 
GTCTGGCGGA 
CAGACCGCCT 
CCGCATCTGA 
GGCGTAGACT 
GCGTTGGCAA 
CGCAACCGTT 
GCGATAAAAA 
CGCTATTTTT 
CCGCTGGATA 
GGCGACCTAT 
CGCCTGGGTC 
GCGGACCCAG 
CGTTGTTGCA 
GCAACAACGT 
ACCGCTCACG 
TGGCGAGTGC 
AACCTACCGG 
TTGGATGGCC 
AAGTGGCGAG 
TTCACCGCTC 
CTGGCGCAGG 
GACCGCGTCC 
AAACTATCCC 
TTTGATAGGG 
CATTGTCAGA 
GTAACAGTCT 
CGCTGCGGGA 
GCGACGCCCT 
CTTCCAGTTC 
GAAGGTCAAG 
GCCATCGCCA 
CGGTAGCGGT 
GGTTTCCATA 
CCAAAGGTAT 
GGCGGAATTC 
CCGCCTTAAG 
GTCAAAAATA 
CAGTTTTTAT 



GCTACCCATT 
CAGTATCCCC 
GTCATAGGGG 
GCTGATTAAA 
CGACTAATTT 
ATTTTGGCGA 
TAAAACCGCT 
TTTGCCGACC 
AAACGGCTGG 
GCAGTTTTTC 
CGTCAAAAAG 
AATACCTGTT 
TTATGGACAA 
CTGGATGGTA 
GACCTACCAT 
ACAAGGTAAA 
TGTTCCATTT 
CCGGGCAACT 
GGCCCGTTGA 
TGGTCAGAAG 
ACCAGTCTTC 
AAACCTCAGT 
TTTGGAGTCA 
CCACCAGCGA 
GGTGGTCGCT 
TTTAACCGCC 
AAATTGGCGG 
ACAACTGCTG 
TGTTGACGAC 
ACGACATTGG 
TGCTGTAACC 
GAACGCTGGA 
CTTGCGACCT 
GTGCACGGCA 
CACGTGCCGT 
CGTGGCAGCA 
GCACCGTCGT 
ATTGATGGTA 
TAACTACCAT 
CGATACACCG 
GCTATGTGGC 
TAGCAGAGCG 
ATCGTCTCGC 
GACCGCCTTA 
CTGGCGGAAT 
CATGTATACC 
GTACATATGG 
CGCGCGAATT 
GCGCGCTTAA 
AACATCAGCC 
TTGTAGTCGG 
TCTGCTGCAC 
AGACGACGTG 
TGGGGATTGG 
ACCCCTAACC 
CAGCTGAGCG 
GTCGACTCGC 
ATAATAACCG 
TATTATTGGC 



GTCAGAACCG 
GTTTACAGGG 
CAAATGTCCC 
TATGATGAAA 
ATACTACTTT 
TACGCCGAAC 
ATGCGGCTTG 
GCACGCCGCA 
CGTGCGGCGT 
CAGTTCCGTT 
GTCAAGGCAA 
CCGTCATAGC 
GGCAGTATCG 
AGCCGCTGGC 
TCGGCGACCG 
CAGTTGATTG 
GTCAACTAAC 
CTGGCTCACA 
GACCGAGTGT 
CCGGGCACAT 
GGCCCGTGTA 
GTGACGCTCC 
CACTGCGAGG 
AATGGATTTT 
TTACCTAAAA 
AGTCAGGCTT 
TCAGTCCGAA 
ACGCCGCTGC 
TGCGGCGACG 
CGTAAGTGAA 
GCATTCACTT 
AGGCGGCGGG 
TCCGCCGCCC 
GATACACTTG 
CTATGTGAAC 
TCAGGGGAAA 
AGTCCCCTTT 
GTGGTCAAAT 
CACCAGTTTA 
CATCCGGCGC 
GTAGGCCGCG 
GGTAAACTGG 
CCATTTGACC 
CTGCCGCCTG 
GACGGCGGAC 
CCGTACGTCT 
GGCATGCAGA 
GAATTATGGC 
CTTAATACCG 
GCTACAGTCA 
CGATGTCAGT 
GCGGAAGAAG 
CGCCTTCTTC 
TGGCGACGAC 
ACCGCTGCTG 
CCGGTCGCTA 
GGCCAGCGAT 
GGCAGGGGGG 
CCGTCCCCCC 



CCAAAGCGAT 
CGGCTTCGTC 
GCCGAAGCAG 
ACGGCAACCC 
TGCCGTTGGG 
GATCGCCAGT 
CTAGCGGTCA 
TCCAGCGCTG 
AGGTCGCGAC 
TATCCGGGCA 
ATAGGCCCGT 
GATAACGAGC 
CTATTGCTCG 
AAGCGGTGAA 
TTCGCCACTT 
AACTGCCTGA 
TTGACGGACT 
GTACGCGTAG 
CATGCGCATC 
CAGCGCCTGG 
GTCGCGGACC 
CCGCCGCGTC 
GGCGGCGCAG 
TGCATCGAGC 
ACGTAGCTCG 
TCTTTCACAG 
AGAAAGTGTC 
GCGATCAGTT 
CGCTAGTCAA 
GCGACCCGCA 
CGCTGGGCGT 
CCATTACCAG 
GGTAATGGTC 
CTGATGCGGT 
GACTACGCCA 
ACCTTATTTA 
TGGAATAAAT 
GGCGATTACC 
CCGCTAATGG 
GGATTGGCCT 
CCTAACCGGA 
CTCGGATTAG 
GAGCCTAATC 
TTTTGACCGC 
AAAACTGGCG 
TCCCGAGCGA 
AGGGCTCGCT 
CCACACCAGT 
GGTGTGGTCA 
ACAGCAACTG 
TGTCGTTGAC 
GCACATGGCT 
CGTGTACCGA 
TCCTGGAGCC 
AGGACCTCGG 
CCATTACCAG 
GGTAATGGTC 
ATCCGGAGCT 
TAGGCCTCGA 
9001 TATCGCAGAT CAATGATCGC TGTACAATCT GGAAATATTG AAATATGTAG 
ATAGCGTCTA GTTACTAGCG ACATGTTAGA CCTTTATAAC TTTATACATC 
9051 CACACTACTT AAAAAAAATA AAATGTCCAG AACTGGGAAA AATTGATCTT 

GTGTGATGAA TTTTTTTTAT TTTACAGGTC TTGACCCTTT TTAACTAGAA 
9101 GCCAGCTGTA ATTCATGGTA GAAAAGAAGT GCTCAGGCTA CTTTTCAACA 

CGGTCGACAT TAAGTACCAT CTTTTCTTCA CGAGTCCGAT GAAAAGTTGT 
9151 AAGGAGCAGA TGTAAACTAC ATCTTTGAAA GAAATGGAAA ATCATATACT 

TTCCTCGTCT ACATTTGATG TAGAAACTTT CTTTACCTTT TAGTATATGA 
9201 GTTTTGGAAT TGATTAAAGA AAGTTACTCT GAGACACAAA AGAGGTAGCT 

CAAAACCTTA ACTAATTTCT TTCAATGAGA CTCTGTGTTT TCTCCATCGA 
9251 GAAGTGGTAC TCTCAAAGGT ACGTGACTAA TTAGCTATAA AAAGGATCCG 

CTTCACCATG AGAGTTTCCA TGCACTGATT AATCGATATT TTTCCTAGGC 
9301 GTACCCTCGA GTCTAGAATC GATCCCGGGT TAATTAATTA GTTATTAGAC 

CATGGGAGCT CAGATCTTAG CTAGGGCCCA ATTAATTAAT CAATAATCTG 
9351 AAGGTGAAAA CGAAACTATT TGTAGCTTAA TTAATTAGAG CTTCTTTATT 

TTCCACTTTT GCTTTGATAA ACATCGAATT AATTAATCTC GAAGAAATAA 
9401 CTATACTTAA AAAGTGAAAA TAAATACAAA GGTTCTTGAG GGTTGTGTTA 

GATATGAATT TTTCACTTTT ATTTATGTTT CCAAGAACTC CCAACACAAT 
9451 AATTGAAAGC GAGAAATAAT CATAAATTAT TTCATTATCG CGATATCCGT 

TTAACTTTCG CTCTTTATTA GTATTTAATA AAGTAATAGC GCTATAGGCA 
9501 TAAGTTTCTA TCGTA 

ATTCAAACAT AGCAT 
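
Because the listing shows both strands, each lower row should be the base-by-base complement of the row above it, which makes it possible to spot misread characters. The following is a minimal sketch of such a check, not part of the figure: the function name `strand_mismatches` is hypothetical, the first pair of rows is copied from positions 1-50 of the listing, and the second pair is an illustrative example containing a misread character ('6' in place of 'G').

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def strand_mismatches(top, bottom):
    """Return 1-based positions where `bottom` is not the complement of `top`.

    Spaces between the groups of ten bases are ignored; any character
    that is not A, C, G or T counts as a mismatch.
    """
    a = top.replace(" ", "").upper()
    b = bottom.replace(" ", "").upper()
    if len(a) != len(b):
        raise ValueError("strands must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if COMPLEMENT.get(x) != y]


top = "ATGGCGCCCC CGCAGGTCCT CGCGTTCGGG CTTCTGCTTG CCGCGGCGAC"
bottom = "TACCGCGGGG GCGTCCAGGA GCGCAAGCCC GAAGACGAAC GGCGCCGCTG"
print(strand_mismatches(top, bottom))  # []

# Illustrative pair with one misread character at position 9.
print(strand_mismatches("CCTGCTCT6A", "GGACGAGACT"))  # [9]
```

When one strand of a pair is legible, the other can be restored from it by complementing each base.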
FIGURE 6 